Dataset statistics
| Number of variables | 22 |
|---|---|
| Number of observations | 45366 |
| Missing cells | 94468 |
| Missing cells (%) | 9.5% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 7.6 MiB |
| Average record size in memory | 176.0 B |
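The percentage and per-record figures in this table can be recovered from the raw counts. A minimal sketch of that arithmetic follows, assuming that "Missing cells (%)" is missing cells divided by variables × observations, and that the average record size is total memory divided by observations. The variable names are illustrative and do not come from the profiler.

```python
# Raw figures copied from the dataset statistics table.
n_variables = 22
n_observations = 45366
missing_cells = 94468
total_memory_mib = 7.6

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes; 7.6 MiB is itself a rounded figure.
avg_record_bytes = total_memory_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.0f} B")
```

Because the memory total is rounded to one decimal place, the recomputed record size only matches the reported 176.0 B after rounding.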
Variable types
| Categorical | 12 |
|---|---|
| Numeric | 9 |
| DateTime | 1 |
Alerts
| Alert | Type |
|---|---|
| belongs_to_collection has a high cardinality: 1695 distinct values | High cardinality |
| genres has a high cardinality: 4064 distinct values | High cardinality |
| original_language has a high cardinality: 89 distinct values | High cardinality |
| overview has a high cardinality: 44231 distinct values | High cardinality |
| production_companies has a high cardinality: 22666 distinct values | High cardinality |
| production_countries has a high cardinality: 2388 distinct values | High cardinality |
| spoken_languages has a high cardinality: 1841 distinct values | High cardinality |
| tagline has a high cardinality: 20269 distinct values | High cardinality |
| title has a high cardinality: 42195 distinct values | High cardinality |
| cast has a high cardinality: 42656 distinct values | High cardinality |
| crew has a high cardinality: 42943 distinct values | High cardinality |
| budget is highly overall correlated with revenue and 1 other fields | High correlation |
| popularity is highly overall correlated with vote_count | High correlation |
| revenue is highly overall correlated with budget and 2 other fields | High correlation |
| vote_count is highly overall correlated with popularity and 1 other fields | High correlation |
| return is highly overall correlated with budget and 1 other fields | High correlation |
| original_language is highly imbalanced (67.4%) | Imbalance |
| production_countries is highly imbalanced (57.7%) | Imbalance |
| spoken_languages is highly imbalanced (62.0%) | Imbalance |
| status is highly imbalanced (97.0%) | Imbalance |
| belongs_to_collection has 40878 (90.1%) missing values | Missing |
| genres has 2383 (5.3%) missing values | Missing |
| overview has 941 (2.1%) missing values | Missing |
| production_companies has 11792 (26.0%) missing values | Missing |
| production_countries has 6208 (13.7%) missing values | Missing |
| spoken_languages has 3888 (8.6%) missing values | Missing |
| tagline has 24970 (55.0%) missing values | Missing |
| cast has 2348 (5.2%) missing values | Missing |
| crew has 723 (1.6%) missing values | Missing |
| popularity is highly skewed (γ1 = 29.21456901) | Skewed |
| return is highly skewed (γ1 = 138.3142814) | Skewed |
| overview is uniformly distributed | Uniform |
| tagline is uniformly distributed | Uniform |
| title is uniformly distributed | Uniform |
| cast is uniformly distributed | Uniform |
| crew is uniformly distributed | Uniform |
| budget has 36477 (80.4%) zeros | Zeros |
| popularity has 1428 (3.1%) zeros | Zeros |
| revenue has 37958 (83.7%) zeros | Zeros |
| runtime has 1534 (3.4%) zeros | Zeros |
| vote_average has 2947 (6.5%) zeros | Zeros |
| vote_count has 2849 (6.3%) zeros | Zeros |
| return has 40043 (88.3%) zeros | Zeros |
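The Skewed and Zeros alerts rest on two simple column statistics. The sketch below shows one way to compute them with the standard library, assuming the plain population skewness m3 / m2^1.5. The profiler may apply a small-sample correction, so this is not claimed to reproduce the reported γ1 values exactly. The function names and the toy column are hypothetical.

```python
from statistics import fmean


def skewness(values):
    """Population skewness m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for x in values if x == 0) / len(values)


# Hypothetical toy column: mostly zeros with one large value,
# loosely mimicking the shape of budget or revenue.
toy = [0, 0, 0, 0, 0, 0, 0, 0, 1, 100]
print(f"skewness: {skewness(toy):.2f}")
print(f"zeros: {zero_share(toy):.0%}")
```

A column dominated by zeros with a few large values yields a strongly positive skewness, which is the pattern the alerts flag for popularity and return.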
Reproduction
| Analysis started | 2023-06-09 23:19:48.070327 |
|---|---|
| Analysis finished | 2023-06-09 23:20:23.442462 |
| Duration | 35.37 seconds |
| Software version | pandas-profiling v3.6.6 |
belongs_to_collection
Categorical
HIGH CARDINALITY  MISSING 
| Distinct | 1695 |
|---|---|
| Distinct (%) | 37.8% |
| Missing | 40878 |
| Missing (%) | 90.1% |
| Memory size | 354.5 KiB |
Length
| Max length | 54 |
|---|---|
| Median length | 43 |
| Mean length | 23.855838 |
| Min length | 3 |
Characters and Unicode
| Total characters | 107065 |
|---|---|
| Distinct characters | 166 |
| Distinct categories | 12 |
| Distinct scripts | 7 |
| Distinct blocks | 8 |
Unique
| Unique | 390 |
|---|---|
| Unique (%) | 8.7% |
Sample
| 1st row | Toy Story Collection |
|---|---|
| 2nd row | Grumpy Old Men Collection |
| 3rd row | Father of the Bride Collection |
| 4th row | James Bond Collection |
| 5th row | Balto Collection |
Common Values
| Value | Count | Frequency (%) |
| The Bowery Boys | 29 | 0.1% |
| Totò Collection | 27 | 0.1% |
| James Bond Collection | 26 | 0.1% |
| Zatôichi: The Blind Swordsman | 26 | 0.1% |
| The Carry On Collection | 25 | 0.1% |
| Pokémon Collection | 22 | < 0.1% |
| Charlie Chan (Sidney Toler) Collection | 21 | < 0.1% |
| Godzilla (Showa) Collection | 16 | < 0.1% |
| Dragon Ball Z (Movie) Collection | 15 | < 0.1% |
| Uuno Turhapuro | 15 | < 0.1% |
| Other values (1685) | 4266 | 9.4% |
| (Missing) | 40878 | 90.1% |
Words
| Value | Count | Frequency (%) |
| collection | 3743 | |
| the | 1146 | 7.8% |
| of | 230 | 1.6% |
| series | 147 | 1.0% |
| (blank) | 139 | 0.9% |
| trilogy | 87 | 0.6% |
| and | 84 | 0.6% |
| man | 62 | 0.4% |
| a | 62 | 0.4% |
| in | 56 | 0.4% |
| Other values (2407) | 9028 |
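A word-frequency table like the one above can be approximated by lowercasing each value and splitting on whitespace. That tokenisation rule is an assumption, since the report does not state how the profiler splits words. The sketch uses only the five sample rows shown earlier for this variable.

```python
from collections import Counter

# The five sample rows listed for belongs_to_collection.
samples = [
    "Toy Story Collection",
    "Grumpy Old Men Collection",
    "Father of the Bride Collection",
    "James Bond Collection",
    "Balto Collection",
]

# Assumption: words are lowercased and separated by whitespace only.
words = Counter(w for text in samples for w in text.lower().split())

for word, count in words.most_common(3):
    print(word, count)
```

Even on five rows, "collection" dominates, consistent with it being the most frequent word in the full column.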
Most occurring characters
| Value | Count | Frequency (%) |
| o | 11114 | 10.4% |
| e | 10450 | 9.8% |
| (space) | 10297 | 9.6% |
| l | 10200 | 9.5% |
| i | 7559 | 7.1% |
| n | 7403 | 6.9% |
| t | 6488 | 6.1% |
| c | 4845 | 4.5% |
| C | 4474 | 4.2% |
| a | 4459 | 4.2% |
| Other values (156) | 29776 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 81103 | |
| Uppercase Letter | 13885 | 13.0% |
| Space Separator | 10297 | 9.6% |
| Other Punctuation | 576 | 0.5% |
| Open Punctuation | 335 | 0.3% |
| Close Punctuation | 335 | 0.3% |
| Decimal Number | 321 | 0.3% |
| Dash Punctuation | 162 | 0.2% |
| Other Letter | 37 | < 0.1% |
| Final Punctuation | 9 | < 0.1% |
| Other values (2) | 5 | < 0.1% |
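The category names in this table are Unicode general categories. As a rough illustration, Python's standard `unicodedata.category` returns the two-letter code for each character, which can then be mapped to the long labels used in the report. The `LABELS` mapping below is a hypothetical helper covering only the three categories that occur in the five sample rows.

```python
import unicodedata
from collections import Counter

# The five sample rows listed for belongs_to_collection.
samples = [
    "Toy Story Collection",
    "Grumpy Old Men Collection",
    "Father of the Bride Collection",
    "James Bond Collection",
    "Balto Collection",
]

# Hypothetical mapping from two-letter codes to the report's labels.
LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
}

# Count the general category of every character; unmapped codes keep their code.
counts = Counter(
    LABELS.get(unicodedata.category(ch), unicodedata.category(ch))
    for text in samples
    for ch in text
)

for label, count in counts.most_common():
    print(label, count)
```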
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 11114 | |
| e | 10450 | |
| l | 10200 | |
| i | 7559 | |
| n | 7403 | |
| t | 6488 | |
| c | 4845 | 6.0% |
| a | 4459 | 5.5% |
| r | 3870 | 4.8% |
| s | 2588 | 3.2% |
| Other values (69) | 12127 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 4474 | |
| T | 1527 | 11.0% |
| S | 1063 | 7.7% |
| B | 682 | 4.9% |
| M | 630 | 4.5% |
| A | 509 | 3.7% |
| D | 505 | 3.6% |
| H | 462 | 3.3% |
| P | 432 | 3.1% |
| G | 417 | 3.0% |
| Other values (33) | 3184 |
Other Letter
| Value | Count | Frequency (%) |
| つ | 3 | 8.1% |
| は | 3 | 8.1% |
| よ | 3 | 8.1% |
| シ | 3 | 8.1% |
| リ | 3 | 8.1% |
| ら | 3 | 8.1% |
| い | 3 | 8.1% |
| ズ | 3 | 8.1% |
| 男 | 3 | 8.1% |
| 식 | 2 | 5.4% |
| Other values (4) | 8 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 172 | |
| ' | 107 | |
| : | 99 | |
| , | 79 | |
| & | 52 | 9.0% |
| ! | 35 | 6.1% |
| / | 21 | 3.6% |
| ? | 4 | 0.7% |
| * | 4 | 0.7% |
| … | 3 | 0.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 80 | |
| 9 | 64 | |
| 3 | 54 | |
| 0 | 51 | |
| 2 | 21 | 6.5% |
| 8 | 13 | 4.0% |
| 5 | 12 | 3.7% |
| 7 | 11 | 3.4% |
| 6 | 10 | 3.1% |
| 4 | 5 | 1.6% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 330 | |
| [ | 5 | 1.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 330 | |
| ] | 5 | 1.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 160 | |
| – | 2 | 1.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 10297 | |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 9 |
Modifier Letter
| Value | Count | Frequency (%) |
| ー | 3 |
Other Number
| Value | Count | Frequency (%) |
| ½ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 94574 | |
| Common | 12040 | 11.2% |
| Cyrillic | 414 | 0.4% |
| Hiragana | 15 | < 0.1% |
| Hangul | 10 | < 0.1% |
| Katakana | 9 | < 0.1% |
| Han | 3 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 11114 | |
| e | 10450 | |
| l | 10200 | |
| i | 7559 | 8.0% |
| n | 7403 | 7.8% |
| t | 6488 | 6.9% |
| c | 4845 | 5.1% |
| C | 4474 | 4.7% |
| a | 4459 | 4.7% |
| r | 3870 | 4.1% |
| Other values (70) | 23712 |
Cyrillic
| Value | Count | Frequency (%) |
| л | 48 | 11.6% |
| и | 41 | 9.9% |
| о | 37 | 8.9% |
| к | 30 | 7.2% |
| е | 27 | 6.5% |
| я | 25 | 6.0% |
| а | 17 | 4.1% |
| ц | 16 | 3.9% |
| К | 16 | 3.9% |
| р | 14 | 3.4% |
| Other values (32) | 143 |
Common
| Value | Count | Frequency (%) |
| (space) | 10297 | |
| ( | 330 | 2.7% |
| ) | 330 | 2.7% |
| . | 172 | 1.4% |
| - | 160 | 1.3% |
| ' | 107 | 0.9% |
| : | 99 | 0.8% |
| 1 | 80 | 0.7% |
| , | 79 | 0.7% |
| 9 | 64 | 0.5% |
| Other values (20) | 322 | 2.7% |
Hiragana
| Value | Count | Frequency (%) |
| つ | 3 | |
| は | 3 | |
| よ | 3 | |
| ら | 3 | |
| い | 3 |
Hangul
| Value | Count | Frequency (%) |
| 식 | 2 | |
| 객 | 2 | |
| 시 | 2 | |
| 리 | 2 | |
| 즈 | 2 |
Katakana
| Value | Count | Frequency (%) |
| シ | 3 | |
| リ | 3 | |
| ズ | 3 |
Han
| Value | Count | Frequency (%) |
| 男 | 3 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 106351 | |
| Cyrillic | 414 | 0.4% |
| None | 246 | 0.2% |
| Hiragana | 15 | < 0.1% |
| Punctuation | 14 | < 0.1% |
| Katakana | 12 | < 0.1% |
| Hangul | 10 | < 0.1% |
| CJK | 3 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 11114 | 10.5% |
| e | 10450 | 9.8% |
| (space) | 10297 | 9.7% |
| l | 10200 | 9.6% |
| i | 7559 | 7.1% |
| n | 7403 | 7.0% |
| t | 6488 | 6.1% |
| c | 4845 | 4.6% |
| C | 4474 | 4.2% |
| a | 4459 | 4.2% |
| Other values (67) | 29062 |
Cyrillic
| Value | Count | Frequency (%) |
| л | 48 | 11.6% |
| и | 41 | 9.9% |
| о | 37 | 8.9% |
| к | 30 | 7.2% |
| е | 27 | 6.5% |
| я | 25 | 6.0% |
| а | 17 | 4.1% |
| ц | 16 | 3.9% |
| К | 16 | 3.9% |
| р | 14 | 3.4% |
| Other values (32) | 143 |
None
| Value | Count | Frequency (%) |
| é | 45 | |
| ä | 40 | |
| ô | 35 | |
| ò | 28 | |
| ö | 19 | |
| ó | 14 | 5.7% |
| ı | 14 | 5.7% |
| í | 9 | 3.7% |
| á | 4 | 1.6% |
| İ | 4 | 1.6% |
| Other values (19) | 34 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 9 | |
| … | 3 | 21.4% |
| – | 2 | 14.3% |
Hiragana
| Value | Count | Frequency (%) |
| つ | 3 | |
| は | 3 | |
| よ | 3 | |
| ら | 3 | |
| い | 3 |
Katakana
| Value | Count | Frequency (%) |
| シ | 3 | |
| リ | 3 | |
| ー | 3 | |
| ズ | 3 |
CJK
| Value | Count | Frequency (%) |
| 男 | 3 |
Hangul
| Value | Count | Frequency (%) |
| 식 | 2 | |
| 객 | 2 | |
| 시 | 2 | |
| 리 | 2 | |
| 즈 | 2 |
budget
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 1223 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4235014.7 |
| Minimum | 0 |
|---|---|
| Maximum | 3.8 × 10⁸ |
| Zeros | 36477 |
| Zeros (%) | 80.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 354.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 25000000 |
| Maximum | 3.8 × 10⁸ |
| Range | 3.8 × 10⁸ |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 17442463 |
|---|---|
| Coefficient of variation (CV) | 4.1186309 |
| Kurtosis | 66.604548 |
| Mean | 4235014.7 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 7.1164523 |
| Sum | 1.9212568 × 10¹¹ |
| Variance | 3.0423951 × 10¹⁴ |
| Monotonicity | Not monotonic |
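Several of the descriptive statistics for budget are tied together by simple identities, which gives a quick consistency check on the table. The sketch below recomputes the coefficient of variation, the variance and the sum from the reported mean, standard deviation and observation count. The inputs are rounded values from the report, so agreement is only expected to a few significant digits.

```python
# Figures copied from the budget tables and the dataset statistics.
n = 45366
mean = 4235014.7
std = 17442463

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum recovered from mean and count

print(f"CV       ~ {cv:.4f}")
print(f"Variance ~ {variance:.4e}")
print(f"Sum      ~ {total:.4e}")
```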
Common values
| Value | Count | Frequency (%) |
| 0 | 36477 | 80.4% |
| 5000000 | 286 | 0.6% |
| 10000000 | 259 | 0.6% |
| 20000000 | 243 | 0.5% |
| 2000000 | 242 | 0.5% |
| 15000000 | 226 | 0.5% |
| 3000000 | 223 | 0.5% |
| 25000000 | 206 | 0.5% |
| 1000000 | 197 | 0.4% |
| 30000000 | 192 | 0.4% |
| Other values (1213) | 6815 | 15.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 36477 | 80.4% |
| 1 | 25 | 0.1% |
| 2 | 14 | < 0.1% |
| 3 | 9 | < 0.1% |
| 4 | 8 | < 0.1% |
| 5 | 8 | < 0.1% |
| 6 | 5 | < 0.1% |
| 7 | 4 | < 0.1% |
| 8 | 5 | < 0.1% |
| 9 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 380000000 | 1 | < 0.1% |
| 300000000 | 1 | < 0.1% |
| 280000000 | 1 | < 0.1% |
| 270000000 | 1 | < 0.1% |
| 260000000 | 3 | < 0.1% |
| 258000000 | 1 | < 0.1% |
| 255000000 | 1 | < 0.1% |
| 250000000 | 10 | < 0.1% |
| 245000000 | 2 | < 0.1% |
| 237000000 | 1 | < 0.1% |
genres
Categorical
HIGH CARDINALITY  MISSING 
| Distinct | 4064 |
|---|---|
| Distinct (%) | 9.5% |
| Missing | 2383 |
| Missing (%) | 5.3% |
| Memory size | 354.5 KiB |
Length
| Max length | 80 |
|---|---|
| Median length | 65 |
| Mean length | 16.46188 |
| Min length | 3 |
Characters and Unicode
| Total characters | 707581 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2366 |
|---|---|
| Unique (%) | 5.5% |
Sample
| 1st row | Animation, Comedy, Family |
|---|---|
| 2nd row | Adventure, Fantasy, Family |
| 3rd row | Romance, Comedy |
| 4th row | Comedy, Drama, Romance |
| 5th row | Comedy |
Common Values
| Value | Count | Frequency (%) |
| Drama | 5001 | 11.0% |
| Comedy | 3620 | 8.0% |
| Documentary | 2713 | 6.0% |
| Drama, Romance | 1300 | 2.9% |
| Comedy, Drama | 1133 | 2.5% |
| Horror | 974 | 2.1% |
| Comedy, Romance | 930 | 2.0% |
| Comedy, Drama, Romance | 593 | 1.3% |
| Drama, Comedy | 531 | 1.2% |
| Horror, Thriller | 528 | 1.2% |
| Other values (4054) | 25660 | |
| (Missing) | 2383 | 5.3% |
Words
| Value | Count | Frequency (%) |
| drama | 20250 | |
| comedy | 13178 | |
| thriller | 7618 | 8.0% |
| romance | 6734 | 7.1% |
| action | 6590 | 7.0% |
| horror | 4669 | 4.9% |
| crime | 4306 | 4.5% |
| documentary | 3921 | 4.1% |
| adventure | 3493 | 3.7% |
| science | 3039 | 3.2% |
| Other values (12) | 21021 |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 69055 | 9.8% |
| a | 61800 | 8.7% |
| e | 55751 | 7.9% |
| m | 53087 | 7.5% |
| (space) | 51836 | 7.3% |
| o | 48511 | 6.9% |
| , | 48031 | 6.8% |
| i | 39638 | 5.6% |
| n | 35648 | 5.0% |
| y | 28500 | 4.0% |
| Other values (20) | 215724 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 512129 | |
| Uppercase Letter | 95585 | 13.5% |
| Space Separator | 51836 | 7.3% |
| Other Punctuation | 48031 | 6.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 69055 | |
| a | 61800 | |
| e | 55751 | |
| m | 53087 | |
| o | 48511 | |
| i | 39638 | |
| n | 35648 | |
| y | 28500 | |
| c | 27960 | |
| t | 26186 | 5.1% |
| Other values (7) | 65993 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 24171 | |
| C | 17484 | |
| A | 12013 | |
| F | 9737 | |
| T | 8384 | 8.8% |
| R | 6734 | 7.0% |
| H | 6066 | 6.3% |
| M | 4826 | 5.0% |
| S | 3039 | 3.2% |
| W | 2365 | 2.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 51836 | |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 48031 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 607714 | |
| Common | 99867 | 14.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 69055 | |
| a | 61800 | 10.2% |
| e | 55751 | 9.2% |
| m | 53087 | 8.7% |
| o | 48511 | 8.0% |
| i | 39638 | 6.5% |
| n | 35648 | 5.9% |
| y | 28500 | 4.7% |
| c | 27960 | 4.6% |
| t | 26186 | 4.3% |
| Other values (18) | 161578 |
Common
| Value | Count | Frequency (%) |
| (space) | 51836 | |
| , | 48031 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 707581 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 69055 | 9.8% |
| a | 61800 | 8.7% |
| e | 55751 | 7.9% |
| m | 53087 | 7.5% |
| (space) | 51836 | 7.3% |
| o | 48511 | 6.9% |
| , | 48031 | 6.8% |
| i | 39638 | 5.6% |
| n | 35648 | 5.0% |
| y | 28500 | 4.0% |
| Other values (20) | 215724 |
id
Real number (ℝ)
| Distinct | 45345 |
|---|---|
| Distinct (%) | > 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 108023.61 |
| Minimum | 2 |
|---|---|
| Maximum | 469172 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 354.5 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 5334.25 |
| Q1 | 26387.25 |
| median | 59857.5 |
| Q3 | 156500.5 |
| 95-th percentile | 357148.75 |
| Maximum | 469172 |
| Range | 469170 |
| Interquartile range (IQR) | 130113.25 |
Descriptive statistics
| Standard deviation | 112165.81 |
|---|---|
| Coefficient of variation (CV) | 1.0383454 |
| Kurtosis | 0.55962819 |
| Mean | 108023.61 |
| Median Absolute Deviation (MAD) | 44418.5 |
| Skewness | 1.283115 |
| Sum | 4.9005989 × 10⁹ |
| Variance | 1.2581169 × 10¹⁰ |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 4912 | 4 | < 0.1% |
| 110428 | 4 | < 0.1% |
| 132641 | 4 | < 0.1% |
| 69234 | 2 | < 0.1% |
| 77221 | 2 | < 0.1% |
| 159849 | 2 | < 0.1% |
| 84198 | 2 | < 0.1% |
| 22649 | 2 | < 0.1% |
| 12600 | 2 | < 0.1% |
| 10991 | 2 | < 0.1% |
| Other values (45335) | 45340 |
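The id column has 45345 distinct values across 45366 observations, so a handful of ids occur more than once even though the report counts no fully duplicated rows. A minimal sketch for locating such ids with the standard library follows. The `repeated_ids` helper and the toy column are hypothetical; the real ids would come from the dataset itself.

```python
from collections import Counter

# Counts copied from the report.
n_rows = 45366
n_distinct = 45345

# Rows whose id already appears on some other row.
surplus_rows = n_rows - n_distinct


def repeated_ids(ids):
    """Return {id: count} for ids that occur more than once."""
    return {key: c for key, c in Counter(ids).items() if c > 1}


# Hypothetical toy column, not the real data.
toy_ids = [4912, 2, 4912, 3, 69234, 69234, 4912, 4912]

print(surplus_rows)
print(repeated_ids(toy_ids))
```

Rows that share an id but differ in at least one other field would explain why the duplicate-row count stays at zero, though the report itself does not confirm this.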
Minimum 10 values
| Value | Count | Frequency (%) |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 11 | 1 | < 0.1% |
| 12 | 1 | < 0.1% |
| 13 | 1 | < 0.1% |
| 14 | 1 | < 0.1% |
| 15 | 1 | < 0.1% |
| 16 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 469172 | 1 | < 0.1% |
| 468707 | 1 | < 0.1% |
| 468343 | 1 | < 0.1% |
| 467731 | 1 | < 0.1% |
| 465044 | 1 | < 0.1% |
| 464819 | 1 | < 0.1% |
| 464207 | 1 | < 0.1% |
| 464111 | 1 | < 0.1% |
| 463906 | 1 | < 0.1% |
| 463800 | 1 | < 0.1% |
original_language
Categorical
HIGH CARDINALITY  IMBALANCE 
| Distinct | 89 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 11 |
| Missing (%) | < 0.1% |
| Memory size | 354.5 KiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 90710 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 17 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | en |
|---|---|
| 2nd row | en |
| 3rd row | en |
| 4th row | en |
| 5th row | en |
Common Values
| Value | Count | Frequency (%) |
| en | 32196 | |
| fr | 2438 | 5.4% |
| it | 1528 | 3.4% |
| ja | 1351 | 3.0% |
| de | 1077 | 2.4% |
| es | 991 | 2.2% |
| ru | 822 | 1.8% |
| hi | 508 | 1.1% |
| ko | 444 | 1.0% |
| zh | 408 | 0.9% |
| Other values (79) | 3592 | 7.9% |
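The Imbalance alert for original_language reflects how strongly a single value ("en") dominates the column. One common way to quantify this is one minus the normalised Shannon entropy of the class counts. Whether the profiler uses exactly this formula is an assumption here, so the sketch is not expected to reproduce the reported 67.4%, and the toy counts are hypothetical.

```python
import math


def imbalance_score(counts):
    """1 - H / log(k): 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


# Hypothetical toy distributions, not the real original_language counts.
print(imbalance_score([25, 25, 25, 25]))
print(imbalance_score([97, 1, 1, 1]))
```

Computing the real score would need the full list of 89 per-language counts, which the table above only partly shows because 79 values are aggregated under "Other values".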
Words
| Value | Count | Frequency (%) |
| en | 32196 | |
| fr | 2438 | 5.4% |
| it | 1528 | 3.4% |
| ja | 1351 | 3.0% |
| de | 1077 | 2.4% |
| es | 991 | 2.2% |
| ru | 822 | 1.8% |
| hi | 508 | 1.1% |
| ko | 444 | 1.0% |
| zh | 408 | 0.9% |
| Other values (79) | 3592 | 7.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 34519 | |
| n | 32904 | |
| r | 3631 | 4.0% |
| f | 2833 | 3.1% |
| i | 2386 | 2.6% |
| t | 2250 | 2.5% |
| a | 1839 | 2.0% |
| s | 1650 | 1.8% |
| j | 1352 | 1.5% |
| d | 1321 | 1.5% |
| Other values (16) | 6025 | 6.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 90710 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 34519 | |
| n | 32904 | |
| r | 3631 | 4.0% |
| f | 2833 | 3.1% |
| i | 2386 | 2.6% |
| t | 2250 | 2.5% |
| a | 1839 | 2.0% |
| s | 1650 | 1.8% |
| j | 1352 | 1.5% |
| d | 1321 | 1.5% |
| Other values (16) | 6025 | 6.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 90710 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 34519 | |
| n | 32904 | |
| r | 3631 | 4.0% |
| f | 2833 | 3.1% |
| i | 2386 | 2.6% |
| t | 2250 | 2.5% |
| a | 1839 | 2.0% |
| s | 1650 | 1.8% |
| j | 1352 | 1.5% |
| d | 1321 | 1.5% |
| Other values (16) | 6025 | 6.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 90710 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 34519 | |
| n | 32904 | |
| r | 3631 | 4.0% |
| f | 2833 | 3.1% |
| i | 2386 | 2.6% |
| t | 2250 | 2.5% |
| a | 1839 | 2.0% |
| s | 1650 | 1.8% |
| j | 1352 | 1.5% |
| d | 1321 | 1.5% |
| Other values (16) | 6025 | 6.6% |
overview
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 44231 |
|---|---|
| Distinct (%) | 99.6% |
| Missing | 941 |
| Missing (%) | 2.1% |
| Memory size | 354.5 KiB |
Length
| Max length | 1000 |
|---|---|
| Median length | 786 |
| Mean length | 323.29738 |
| Min length | 1 |
Characters and Unicode
| Total characters | 14362486 |
|---|---|
| Distinct characters | 429 |
| Distinct categories | 25 |
| Distinct scripts | 13 |
| Distinct blocks | 21 |
Unique
| Unique | 44185 |
|---|---|
| Unique (%) | 99.5% |
Sample
| 1st row | Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences. |
|---|---|
| 2nd row | When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures. |
| 3rd row | A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max. |
| 4th row | Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe. |
| 5th row | Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own. |
Common Values
| Value | Count | Frequency (%) |
| No overview found. | 133 | 0.3% |
| No Overview | 7 | < 0.1% |
| (blank) | 5 | < 0.1% |
| Winter, 1915. Confined by her family to an asylum in the South of France - where she will never sculpt again - the chronicle of Camille Claudel's reclusive life, as she waits for a visit from her brother, Paul Claudel. | 4 | < 0.1% |
| Ten years into a marriage, the wife is disappointed by the husband's lack of financial success, meaning she has to work and can't treat herself and the husband finds the wife slovenly and mean-spirited: she neither cooks not cleans particularly well and is generally disagreeable. In turn, he alternately ignores her and treats her as a servant. Neither is particularly happy, not helped by their unsatisfactory lodgers. The husband is easily seduced by an ex-colleague, a widow with a small child who needs some security, and considers leaving his wife. | 4 | < 0.1% |
| Television made him famous, but his biggest hits happened off screen. Television producer by day, CIA assassin by night, Chuck Barris was recruited by the CIA at the height of his TV career and trained to become a covert operative. Or so Barris said. | 4 | < 0.1% |
| No movie overview available. | 3 | < 0.1% |
| A few funny little novels about different aspects of life. | 3 | < 0.1% |
| Adaptation of the Jane Austen novel. | 3 | < 0.1% |
| A group of travelers, including a monk, stay in a lonely inn in the mountains. The host confesses the monk his habit of serving poisoned soup to the guests, to rob their possessions and to bury them in the backyard. The story unfolds as the monk tries to save the guest's lives without violating the holy secrecy of the confession. | 2 | < 0.1% |
| Other values (44221) | 44257 | |
| (Missing) | 941 | 2.1% |
Words
| Value | Count | Frequency (%) |
| the | 138054 | 5.6% |
| a | 98868 | 4.0% |
| and | 75243 | 3.1% |
| to | 73297 | 3.0% |
| of | 69558 | 2.8% |
| in | 48132 | 2.0% |
| is | 36498 | 1.5% |
| his | 36152 | 1.5% |
| with | 23893 | 1.0% |
| her | 21478 | 0.9% |
| Other values (97091) | 1826995 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2405820 | |
| e | 1363533 | 9.5% |
| a | 940287 | 6.5% |
| t | 934520 | 6.5% |
| i | 851319 | 5.9% |
| o | 829629 | 5.8% |
| n | 822396 | 5.7% |
| s | 767677 | 5.3% |
| r | 744084 | 5.2% |
| h | 600669 | 4.2% |
| Other values (419) | 4102552 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11147505 | |
| Space Separator | 2405858 | 16.8% |
| Uppercase Letter | 390904 | 2.7% |
| Other Punctuation | 312765 | 2.2% |
| Decimal Number | 42221 | 0.3% |
| Dash Punctuation | 36763 | 0.3% |
| Close Punctuation | 10097 | 0.1% |
| Open Punctuation | 10074 | 0.1% |
| Final Punctuation | 4553 | < 0.1% |
| Initial Punctuation | 881 | < 0.1% |
| Other values (15) | 865 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1363533 | |
| a | 940287 | 8.4% |
| t | 934520 | 8.4% |
| i | 851319 | 7.6% |
| o | 829629 | 7.4% |
| n | 822396 | 7.4% |
| s | 767677 | 6.9% |
| r | 744084 | 6.7% |
| h | 600669 | 5.4% |
| l | 478724 | 4.3% |
| Other values (142) | 2814667 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 42750 | 10.9% |
| T | 35968 | 9.2% |
| S | 31119 | 8.0% |
| M | 23947 | 6.1% |
| B | 23696 | 6.1% |
| C | 22812 | 5.8% |
| H | 19425 | 5.0% |
| W | 18646 | 4.8% |
| I | 16796 | 4.3% |
| D | 16309 | 4.2% |
| Other values (77) | 139436 |
Other Letter
| Value | Count | Frequency (%) |
| न | 6 | 4.8% |
| र | 6 | 4.8% |
| म | 5 | 4.0% |
| の | 4 | 3.2% |
| ద | 3 | 2.4% |
| प | 3 | 2.4% |
| द | 3 | 2.4% |
| अ | 3 | 2.4% |
| م | 2 | 1.6% |
| व | 2 | 1.6% |
| Other values (76) | 88 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 133411 | |
| . | 124771 | |
| ' | 31118 | 9.9% |
| " | 11661 | 3.7% |
| : | 3298 | 1.1% |
| ? | 2759 | 0.9% |
| ; | 2493 | 0.8% |
| ! | 1543 | 0.5% |
| / | 765 | 0.2% |
| & | 453 | 0.1% |
| Other values (12) | 493 | 0.2% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ́ | 4 | |
| ి | 4 | |
| ̈ | 3 | |
| ్ | 3 | |
| ் | 3 | |
| ् | 3 | |
| ా | 2 | 6.1% |
| े | 2 | 6.1% |
| ं | 2 | 6.1% |
| ु | 2 | 6.1% |
| Other values (4) | 5 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 9748 | |
| 0 | 8265 | |
| 9 | 6406 | |
| 2 | 4250 | |
| 5 | 2442 | 5.8% |
| 8 | 2378 | 5.6% |
| 3 | 2340 | 5.5% |
| 4 | 2176 | 5.2% |
| 7 | 2131 | 5.0% |
| 6 | 2085 | 4.9% |
Spacing Mark
| Value | Count | Frequency (%) |
| ा | 11 | |
| ी | 4 | 14.8% |
| ो | 3 | 11.1% |
| ు | 3 | 11.1% |
| ि | 2 | 7.4% |
| ு | 2 | 7.4% |
| ం | 1 | 3.7% |
| ி | 1 | 3.7% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 35240 | |
| – | 881 | 2.4% |
| — | 633 | 1.7% |
| ― | 5 | < 0.1% |
| ‐ | 4 | < 0.1% |
Other Symbol
| Value | Count | Frequency (%) |
| ® | 45 | |
| ™ | 14 | 21.9% |
| ¦ | 2 | 3.1% |
| ° | 2 | 3.1% |
| � | 1 | 1.6% |
Math Symbol
| Value | Count | Frequency (%) |
| ~ | 20 | |
| + | 11 | |
| = | 6 | 15.0% |
| | | 2 | 5.0% |
| − | 1 | 2.5% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 10021 | |
| [ | 50 | 0.5% |
| { | 2 | < 0.1% |
| „ | 1 | < 0.1% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 317 | |
| £ | 10 | 3.0% |
| ₹ | 1 | 0.3% |
| € | 1 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2405820 | |
| (other space character) | 36 | < 0.1% |
| (other space character) | 2 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 10045 | |
| ] | 50 | 0.5% |
| } | 2 | < 0.1% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 3845 | |
| ” | 689 | 15.1% |
| » | 19 | 0.4% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 671 | |
| ‘ | 192 | 21.8% |
| « | 18 | 2.0% |
Control
| Value | Count | Frequency (%) |
| 106 | ||
| | 3 | 2.7% |
| | 1 | 0.9% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ´ | 25 | |
| ` | 12 | |
| ¯ | 1 | 2.6% |
Format
| Value | Count | Frequency (%) |
| | 31 | |
| | 20 |
Other Number
| Value | Count | Frequency (%) |
| ½ | 8 | |
| ¹ | 8 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 19 |
Line Separator
| Value | Count | Frequency (%) |
| 7 |
Letter Number
| Value | Count | Frequency (%) |
| Ⅱ | 2 |
Paragraph Separator
| Value | Count | Frequency (%) |
| 2 |
Modifier Letter
| Value | Count | Frequency (%) |
| ʼ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11533177 | |
| Common | 2823890 | 19.7% |
| Cyrillic | 4587 | < 0.1% |
| Greek | 648 | < 0.1% |
| Devanagari | 77 | < 0.1% |
| Telugu | 30 | < 0.1% |
| Hiragana | 20 | < 0.1% |
| Tamil | 19 | < 0.1% |
| Han | 10 | < 0.1% |
| Hangul | 9 | < 0.1% |
| Other values (3) | 19 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1363533 | |
| a | 940287 | 8.2% |
| t | 934520 | 8.1% |
| i | 851319 | 7.4% |
| o | 829629 | 7.2% |
| n | 822396 | 7.1% |
| s | 767677 | 6.7% |
| r | 744084 | 6.5% |
| h | 600669 | 5.2% |
| l | 478724 | 4.2% |
| Other values (132) | 3200339 |
Common
| Value | Count | Frequency (%) |
| (space) | 2405820 | |
| , | 133411 | 4.7% |
| . | 124771 | 4.4% |
| - | 35240 | 1.2% |
| ' | 31118 | 1.1% |
| " | 11661 | 0.4% |
| ) | 10045 | 0.4% |
| ( | 10021 | 0.4% |
| 1 | 9748 | 0.3% |
| 0 | 8265 | 0.3% |
| Other values (71) | 43790 | 1.6% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 470 | 10.2% |
| е | 404 | 8.8% |
| а | 373 | 8.1% |
| н | 323 | 7.0% |
| и | 299 | 6.5% |
| т | 265 | 5.8% |
| р | 240 | 5.2% |
| с | 218 | 4.8% |
| в | 173 | 3.8% |
| л | 161 | 3.5% |
| Other values (46) | 1661 |
Greek
| Value | Count | Frequency (%) |
| α | 60 | 9.3% |
| ο | 55 | 8.5% |
| τ | 43 | 6.6% |
| ι | 36 | 5.6% |
| η | 36 | 5.6% |
| ν | 34 | 5.2% |
| ρ | 31 | 4.8% |
| ε | 31 | 4.8% |
| ς | 30 | 4.6% |
| π | 30 | 4.6% |
| Other values (33) | 262 |
Devanagari
| Value | Count | Frequency (%) |
| ा | 11 | 14.3% |
| न | 6 | 7.8% |
| र | 6 | 7.8% |
| म | 5 | 6.5% |
| ी | 4 | 5.2% |
| ो | 3 | 3.9% |
| प | 3 | 3.9% |
| द | 3 | 3.9% |
| अ | 3 | 3.9% |
| ् | 3 | 3.9% |
| Other values (21) | 30 |
Hiragana
| Value | Count | Frequency (%) |
| の | 4 | |
| さ | 1 | 5.0% |
| ん | 1 | 5.0% |
| と | 1 | 5.0% |
| そ | 1 | 5.0% |
| ず | 1 | 5.0% |
| め | 1 | 5.0% |
| ひ | 1 | 5.0% |
| ち | 1 | 5.0% |
| か | 1 | 5.0% |
| Other values (7) | 7 |
Telugu
| Value | Count | Frequency (%) |
| ి | 4 | |
| ు | 3 | |
| ద | 3 | |
| ్ | 3 | |
| స | 2 | 6.7% |
| ా | 2 | 6.7% |
| మ | 2 | 6.7% |
| న | 2 | 6.7% |
| ర | 2 | 6.7% |
| హ | 1 | 3.3% |
| Other values (6) | 6 |
Tamil
| Value | Count | Frequency (%) |
| ் | 3 | |
| ம | 2 | |
| ர | 2 | |
| ப | 2 | |
| ு | 2 | |
| ன | 1 | 5.3% |
| வ | 1 | 5.3% |
| த | 1 | 5.3% |
| ஆ | 1 | 5.3% |
| ய | 1 | 5.3% |
| Other values (3) | 3 |
Han
| Value | Count | Frequency (%) |
| 界 | 1 | |
| 俣 | 1 | |
| 患 | 1 | |
| 者 | 1 | |
| 世 | 1 | |
| 水 | 1 | |
| 鬼 | 1 | |
| 見 | 1 | |
| 難 | 1 | |
| 海 | 1 |
Hangul
| Value | Count | Frequency (%) |
| 사 | 2 | |
| 회 | 1 | |
| 식 | 1 | |
| 주 | 1 | |
| 기 | 1 | |
| 찾 | 1 | |
| 랑 | 1 | |
| 첫 | 1 |
Thai
| Value | Count | Frequency (%) |
| ่ | 2 | |
| ง | 1 | |
| ร | 1 | |
| พ | 1 | |
| แ | 1 | |
| ี | 1 | |
| ส | 1 |
Arabic
| Value | Count | Frequency (%) |
| م | 2 | |
| ہ | 1 | |
| ت | 1 |
Inherited
| Value | Count | Frequency (%) |
| ́ | 4 | |
| ̈ | 3 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14344494 | |
| Punctuation | 7266 | 0.1% |
| None | 5928 | < 0.1% |
| Cyrillic | 4587 | < 0.1% |
| Devanagari | 77 | < 0.1% |
| Telugu | 30 | < 0.1% |
| Hiragana | 20 | < 0.1% |
| Tamil | 19 | < 0.1% |
| Letterlike Symbols | 14 | < 0.1% |
| CJK | 10 | < 0.1% |
| Other values (11) | 41 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 2405820 | |
| e | 1363533 | 9.5% |
| a | 940287 | 6.6% |
| t | 934520 | 6.5% |
| i | 851319 | 5.9% |
| o | 829629 | 5.8% |
| n | 822396 | 5.7% |
| s | 767677 | 5.4% |
| r | 744084 | 5.2% |
| h | 600669 | 4.2% |
| Other values (82) | 4084560 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 3845 | |
| – | 881 | 12.1% |
| ” | 689 | 9.5% |
| “ | 671 | 9.2% |
| — | 633 | 8.7% |
| … | 303 | 4.2% |
| ‘ | 192 | 2.6% |
| | 31 | 0.4% |
|  | 7 | 0.1% |
| ― | 5 | 0.1% |
| Other values (4) | 9 | 0.1% |
None
| Value | Count | Frequency (%) |
| é | 1550 | |
| ä | 294 | 5.0% |
| á | 293 | 4.9% |
| ö | 250 | 4.2% |
| í | 243 | 4.1% |
| è | 209 | 3.5% |
| ü | 178 | 3.0% |
| ı | 165 | 2.8% |
| ó | 164 | 2.8% |
| ç | 158 | 2.7% |
| Other values (141) | 2424 |
Cyrillic
| Value | Count | Frequency (%) |
| о | 470 | 10.2% |
| е | 404 | 8.8% |
| а | 373 | 8.1% |
| н | 323 | 7.0% |
| и | 299 | 6.5% |
| т | 265 | 5.8% |
| р | 240 | 5.2% |
| с | 218 | 4.8% |
| в | 173 | 3.8% |
| л | 161 | 3.5% |
| Other values (46) | 1661 |
Letterlike Symbols
| Value | Count | Frequency (%) |
| ™ | 14 |
Devanagari
| Value | Count | Frequency (%) |
| ा | 11 | 14.3% |
| न | 6 | 7.8% |
| र | 6 | 7.8% |
| म | 5 | 6.5% |
| ी | 4 | 5.2% |
| ो | 3 | 3.9% |
| प | 3 | 3.9% |
| द | 3 | 3.9% |
| अ | 3 | 3.9% |
| ् | 3 | 3.9% |
| Other values (21) | 30 |
Alphabetic PF
| Value | Count | Frequency (%) |
| fi | 4 |
Hiragana
| Value | Count | Frequency (%) |
| の | 4 | |
| さ | 1 | 5.0% |
| ん | 1 | 5.0% |
| と | 1 | 5.0% |
| そ | 1 | 5.0% |
| ず | 1 | 5.0% |
| め | 1 | 5.0% |
| ひ | 1 | 5.0% |
| ち | 1 | 5.0% |
| か | 1 | 5.0% |
| Other values (7) | 7 |
Diacriticals
| Value | Count | Frequency (%) |
| ́ | 4 | |
| ̈ | 3 |
Telugu
| Value | Count | Frequency (%) |
| ి | 4 | |
| ు | 3 | |
| ద | 3 | |
| ్ | 3 | |
| స | 2 | 6.7% |
| ా | 2 | 6.7% |
| మ | 2 | 6.7% |
| న | 2 | 6.7% |
| ర | 2 | 6.7% |
| హ | 1 | 3.3% |
| Other values (6) | 6 |
Tamil
| Value | Count | Frequency (%) |
| ் | 3 | |
| ம | 2 | |
| ர | 2 | |
| ப | 2 | |
| ு | 2 | |
| ன | 1 | 5.3% |
| வ | 1 | 5.3% |
| த | 1 | 5.3% |
| ஆ | 1 | 5.3% |
| ய | 1 | 5.3% |
| Other values (3) | 3 |
Arabic
| Value | Count | Frequency (%) |
| م | 2 | |
| ہ | 1 | |
| ت | 1 |
Hangul
| Value | Count | Frequency (%) |
| 사 | 2 | |
| 회 | 1 | |
| 식 | 1 | |
| 주 | 1 | |
| 기 | 1 | |
| 찾 | 1 | |
| 랑 | 1 | |
| 첫 | 1 |
Number Forms
| Value | Count | Frequency (%) |
| Ⅱ | 2 |
Thai
| Value | Count | Frequency (%) |
| ่ | 2 | |
| ง | 1 | |
| ร | 1 | |
| พ | 1 | |
| แ | 1 | |
| ี | 1 | |
| ส | 1 |
Modifier Letters
| Value | Count | Frequency (%) |
| ʼ | 2 |
CJK
| Value | Count | Frequency (%) |
| 界 | 1 | |
| 俣 | 1 | |
| 患 | 1 | |
| 者 | 1 | |
| 世 | 1 | |
| 水 | 1 | |
| 鬼 | 1 | |
| 見 | 1 | |
| 難 | 1 | |
| 海 | 1 |
Math Operators
| Value | Count | Frequency (%) |
| − | 1 |
Katakana
| Value | Count | Frequency (%) |
| ・ | 1 |
Currency Symbols
| Value | Count | Frequency (%) |
| ₹ | 1 | |
| € | 1 |
Specials
| Value | Count | Frequency (%) |
| � | 1 |
popularity
Real number (ℝ)
HIGH CORRELATION  SKEWED  ZEROS 
| Distinct | 2017 |
|---|---|
| Distinct (%) | 4.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.9264705 |
| Minimum | 0 |
| Maximum | 547.49 |
| Zeros | 1428 |
| Zeros (%) | 3.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 354.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.02 |
| Q1 | 0.39 |
| median | 1.13 |
| Q3 | 3.69 |
| 95-th percentile | 11.06 |
| Maximum | 547.49 |
| Range | 547.49 |
| Interquartile range (IQR) | 3.3 |
Descriptive statistics
| Standard deviation | 6.0101494 |
|---|---|
| Coefficient of variation (CV) | 2.0537194 |
| Kurtosis | 1923.5152 |
| Mean | 2.9264705 |
| Median Absolute Deviation (MAD) | 0.97 |
| Skewness | 29.214569 |
| Sum | 132762.26 |
| Variance | 36.121895 |
| Monotonicity | Not monotonic |
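The derived figures in the table above follow from the reported mean, standard deviation, and quartiles. A minimal Python sketch, assuming only the rounded values printed in this report (so results agree only to the printed precision), recomputes three of them; the variable names are illustrative and not part of the report.

```python
# Values copied from the popularity tables above (already rounded by the report).
mean = 2.9264705
std = 6.0101494
q1, q3 = 0.39, 3.69

cv = std / mean        # coefficient of variation = standard deviation / mean
variance = std ** 2    # variance = squared standard deviation
iqr = q3 - q1          # interquartile range = Q3 - Q1

print(f"CV={cv:.4f}  variance={variance:.3f}  IQR={iqr:.2f}")
```

A CV above 2 together with the large skewness is consistent with the SKEWED flag on this variable.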
Common Values
| Value | Count | Frequency (%) |
| 0 | 1428 | 3.1% |
| 0.01 | 651 | 1.4% |
| 0.04 | 591 | 1.3% |
| 0.08 | 411 | 0.9% |
| 0.11 | 409 | 0.9% |
| 0.15 | 381 | 0.8% |
| 0.07 | 349 | 0.8% |
| 0.05 | 337 | 0.7% |
| 0.12 | 319 | 0.7% |
| 0.02 | 292 | 0.6% |
| Other values (2007) | 40198 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 1428 | 3.1% |
| 0.01 | 651 | 1.4% |
| 0.02 | 292 | 0.6% |
| 0.03 | 164 | 0.4% |
| 0.04 | 591 | 1.3% |
| 0.05 | 337 | 0.7% |
| 0.06 | 281 | 0.6% |
| 0.07 | 349 | 0.8% |
| 0.08 | 411 | 0.9% |
| 0.09 | 257 | 0.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 547.49 | 1 | < 0.1% |
| 294.34 | 1 | < 0.1% |
| 287.25 | 1 | < 0.1% |
| 228.03 | 1 | < 0.1% |
| 213.85 | 1 | < 0.1% |
| 187.86 | 1 | < 0.1% |
| 185.33 | 1 | < 0.1% |
| 185.07 | 1 | < 0.1% |
| 183.87 | 1 | < 0.1% |
| 154.8 | 1 | < 0.1% |
production_companies
Categorical
HIGH CARDINALITY  MISSING 
| Distinct | 22666 |
|---|---|
| Distinct (%) | 67.5% |
| Missing | 11792 |
| Missing (%) | 26.0% |
| Memory size | 354.5 KiB |
| Metro-Goldwyn-Mayer (MGM) | 742 |
|---|---|
| Warner Bros. | 540 |
| Paramount Pictures | 505 |
| Twentieth Century Fox Film Corporation | 439 |
| Universal Pictures | 320 |
| Other values (22661) | 31028 |
Length
| Max length | 609 |
|---|---|
| Median length | 412 |
| Mean length | 41.490886 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1393015 |
|---|---|
| Distinct characters | 294 |
| Distinct categories | 17 |
| Distinct scripts | 6 |
| Distinct blocks | 6 |
Unique
| Unique | 20309 |
|---|---|
| Unique (%) | 60.5% |
Sample
| 1st row | Pixar Animation Studios |
|---|---|
| 2nd row | TriStar Pictures, Teitler Film, Interscope Communications |
| 3rd row | Warner Bros., Lancaster Gate |
| 4th row | Twentieth Century Fox Film Corporation |
| 5th row | Sandollar Productions, Touchstone Pictures |
Common Values
| Value | Count | Frequency (%) |
| Metro-Goldwyn-Mayer (MGM) | 742 | 1.6% |
| Warner Bros. | 540 | 1.2% |
| Paramount Pictures | 505 | 1.1% |
| Twentieth Century Fox Film Corporation | 439 | 1.0% |
| Universal Pictures | 320 | 0.7% |
| RKO Radio Pictures | 247 | 0.5% |
| Columbia Pictures Corporation | 207 | 0.5% |
| Columbia Pictures | 146 | 0.3% |
| Mosfilm | 145 | 0.3% |
| Walt Disney Pictures | 85 | 0.2% |
| Other values (22656) | 30198 | 66.6% |
| (Missing) | 11792 | 26.0% |
Words
| Value | Count | Frequency (%) |
| films | 9456 | 5.3% |
| pictures | 9266 | 5.2% |
| productions | 9058 | 5.1% |
| film | 6672 | 3.8% |
| entertainment | 5153 | 2.9% |
| corporation | 2189 | 1.2% |
| company | 1770 | 1.0% |
| warner | 1478 | 0.8% |
| bros | 1411 | 0.8% |
| the | 1381 | 0.8% |
| Other values (18616) | 129810 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 144079 | 10.3% |
| i | 106901 | 7.7% |
| e | 94614 | 6.8% |
| n | 89944 | 6.5% |
| o | 85269 | 6.1% |
| r | 83528 | 6.0% |
| t | 83401 | 6.0% |
| a | 77130 | 5.5% |
| s | 62653 | 4.5% |
| l | 51244 | 3.7% |
| Other values (284) | 514252 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 986721 | |
| Uppercase Letter | 198935 | 14.3% |
| Space Separator | 144084 | 10.3% |
| Other Punctuation | 45100 | 3.2% |
| Decimal Number | 4349 | 0.3% |
| Dash Punctuation | 4329 | 0.3% |
| Open Punctuation | 4325 | 0.3% |
| Close Punctuation | 4324 | 0.3% |
| Math Symbol | 663 | < 0.1% |
| Other Letter | 140 | < 0.1% |
| Other values (7) | 45 | < 0.1% |
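The category labels in the table above correspond to Unicode general categories. As a rough illustration (not necessarily how the profiling tool computes them), Python's standard `unicodedata.category` returns a two-letter code per character that can be mapped to these labels. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced for this sketch, and the mapping covers only the ten categories listed above.

```python
import unicodedata
from collections import Counter

# Partial mapping from Unicode general-category codes to the labels used in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
    "Lo": "Other Letter",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Two values taken from the Sample table of this variable.
sample = ["Warner Bros., Lancaster Gate", "Pixar Animation Studios"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Applied to the full column, the same loop would give per-category totals comparable to the table above, under the assumption that the tool uses the standard general categories.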
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 106901 | |
| e | 94614 | |
| n | 89944 | |
| o | 85269 | |
| r | 83528 | |
| t | 83401 | |
| a | 77130 | 7.8% |
| s | 62653 | 6.3% |
| l | 51244 | 5.2% |
| m | 44264 | 4.5% |
| Other values (102) | 207773 |
Other Letter
| Value | Count | Frequency (%) |
| 스 | 9 | 6.4% |
| 트 | 8 | 5.7% |
| 인 | 6 | 4.3% |
| 주 | 5 | 3.6% |
| 터 | 5 | 3.6% |
| 먼 | 5 | 3.6% |
| 테 | 5 | 3.6% |
| 엔 | 5 | 3.6% |
| 픽 | 4 | 2.9% |
| 이 | 3 | 2.1% |
| Other values (62) | 85 |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 27879 | |
| F | 26351 | |
| C | 20583 | 10.3% |
| M | 13359 | 6.7% |
| S | 11908 | 6.0% |
| E | 9744 | 4.9% |
| A | 9554 | 4.8% |
| T | 9355 | 4.7% |
| B | 9000 | 4.5% |
| G | 7806 | 3.9% |
| Other values (52) | 53396 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 37346 | |
| . | 5681 | 12.6% |
| & | 764 | 1.7% |
| / | 644 | 1.4% |
| ' | 451 | 1.0% |
| " | 133 | 0.3% |
| ! | 36 | 0.1% |
| % | 18 | < 0.1% |
| : | 9 | < 0.1% |
| @ | 5 | < 0.1% |
| Other values (6) | 13 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1034 | |
| 1 | 712 | |
| 0 | 641 | |
| 3 | 558 | |
| 4 | 481 | |
| 9 | 205 | 4.7% |
| 6 | 195 | 4.5% |
| 5 | 178 | 4.1% |
| 8 | 173 | 4.0% |
| 7 | 172 | 4.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 4315 | |
| [ | 9 | 0.2% |
| ( | 1 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 4314 | |
| ] | 9 | 0.2% |
| ) | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 144079 | |
|  | 5 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4327 | |
| – | 2 | < 0.1% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 662 | |
| \| | 1 | 0.2% |
Other Symbol
| Value | Count | Frequency (%) |
| ° | 23 | |
| ㈜ | 2 | 8.0% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 3 | |
| » | 3 |
Other Number
| Value | Count | Frequency (%) |
| ² | 1 | |
| ½ | 1 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 4 |
Control
| Value | Count | Frequency (%) |
| 4 |
Initial Punctuation
| Value | Count | Frequency (%) |
| « | 3 |
Format
| Value | Count | Frequency (%) |
| | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1185253 | |
| Common | 207217 | 14.9% |
| Cyrillic | 373 | < 0.1% |
| Hangul | 115 | < 0.1% |
| Greek | 31 | < 0.1% |
| Han | 26 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 106901 | 9.0% |
| e | 94614 | 8.0% |
| n | 89944 | 7.6% |
| o | 85269 | 7.2% |
| r | 83528 | 7.0% |
| t | 83401 | 7.0% |
| a | 77130 | 6.5% |
| s | 62653 | 5.3% |
| l | 51244 | 4.3% |
| m | 44264 | 3.7% |
| Other values (99) | 406305 |
Hangul
| Value | Count | Frequency (%) |
| 스 | 9 | 7.8% |
| 트 | 8 | 7.0% |
| 인 | 6 | 5.2% |
| 주 | 5 | 4.3% |
| 터 | 5 | 4.3% |
| 먼 | 5 | 4.3% |
| 테 | 5 | 4.3% |
| 엔 | 5 | 4.3% |
| 픽 | 4 | 3.5% |
| 이 | 3 | 2.6% |
| Other values (43) | 60 |
Common
| Value | Count | Frequency (%) |
| (space) | 144079 | |
| , | 37346 | 18.0% |
| . | 5681 | 2.7% |
| - | 4327 | 2.1% |
| ( | 4315 | 2.1% |
| ) | 4314 | 2.1% |
| 2 | 1034 | 0.5% |
| & | 764 | 0.4% |
| 1 | 712 | 0.3% |
| + | 662 | 0.3% |
| Other values (37) | 3983 | 1.9% |
Cyrillic
| Value | Count | Frequency (%) |
| и | 34 | 9.1% |
| о | 28 | 7.5% |
| а | 26 | 7.0% |
| л | 22 | 5.9% |
| н | 20 | 5.4% |
| м | 19 | 5.1% |
| т | 17 | 4.6% |
| е | 16 | 4.3% |
| с | 16 | 4.3% |
| ь | 16 | 4.3% |
| Other values (36) | 159 |
Greek
| Value | Count | Frequency (%) |
| ν | 3 | 9.7% |
| ο | 3 | 9.7% |
| Κ | 2 | 6.5% |
| ρ | 2 | 6.5% |
| τ | 2 | 6.5% |
| η | 2 | 6.5% |
| Ε | 2 | 6.5% |
| λ | 2 | 6.5% |
| ι | 2 | 6.5% |
| έ | 1 | 3.2% |
| Other values (10) | 10 |
Han
| Value | Count | Frequency (%) |
| 影 | 2 | 7.7% |
| 司 | 2 | 7.7% |
| 公 | 2 | 7.7% |
| 有 | 2 | 7.7% |
| 限 | 2 | 7.7% |
| 北 | 2 | 7.7% |
| 京 | 2 | 7.7% |
| 发 | 1 | 3.8% |
| 行 | 1 | 3.8% |
| 电 | 1 | 3.8% |
| Other values (9) | 9 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1386786 | |
| None | 5710 | 0.4% |
| Cyrillic | 373 | < 0.1% |
| Hangul | 113 | < 0.1% |
| CJK | 26 | < 0.1% |
| Punctuation | 7 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 144079 | 10.4% |
| i | 106901 | 7.7% |
| e | 94614 | 6.8% |
| n | 89944 | 6.5% |
| o | 85269 | 6.1% |
| r | 83528 | 6.0% |
| t | 83401 | 6.0% |
| a | 77130 | 5.6% |
| s | 62653 | 4.5% |
| l | 51244 | 3.7% |
| Other values (77) | 508023 |
None
| Value | Count | Frequency (%) |
| é | 3176 | |
| ó | 416 | 7.3% |
| á | 317 | 5.6% |
| í | 173 | 3.0% |
| ü | 154 | 2.7% |
| ñ | 150 | 2.6% |
| ô | 140 | 2.5% |
| è | 136 | 2.4% |
| ä | 136 | 2.4% |
| ö | 132 | 2.3% |
| Other values (76) | 780 | 13.7% |
Cyrillic
| Value | Count | Frequency (%) |
| и | 34 | 9.1% |
| о | 28 | 7.5% |
| а | 26 | 7.0% |
| л | 22 | 5.9% |
| н | 20 | 5.4% |
| м | 19 | 5.1% |
| т | 17 | 4.6% |
| е | 16 | 4.3% |
| с | 16 | 4.3% |
| ь | 16 | 4.3% |
| Other values (36) | 159 |
Hangul
| Value | Count | Frequency (%) |
| 스 | 9 | 8.0% |
| 트 | 8 | 7.1% |
| 인 | 6 | 5.3% |
| 주 | 5 | 4.4% |
| 터 | 5 | 4.4% |
| 먼 | 5 | 4.4% |
| 테 | 5 | 4.4% |
| 엔 | 5 | 4.4% |
| 픽 | 4 | 3.5% |
| 이 | 3 | 2.7% |
| Other values (42) | 58 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 3 | |
| – | 2 | |
| • | 1 | 14.3% |
| | 1 | 14.3% |
CJK
| Value | Count | Frequency (%) |
| 影 | 2 | 7.7% |
| 司 | 2 | 7.7% |
| 公 | 2 | 7.7% |
| 有 | 2 | 7.7% |
| 限 | 2 | 7.7% |
| 北 | 2 | 7.7% |
| 京 | 2 | 7.7% |
| 发 | 1 | 3.8% |
| 行 | 1 | 3.8% |
| 电 | 1 | 3.8% |
| Other values (9) | 9 |
production_countries
Categorical
HIGH CARDINALITY  IMBALANCE  MISSING 
| Distinct | 2388 |
|---|---|
| Distinct (%) | 6.1% |
| Missing | 6208 |
| Missing (%) | 13.7% |
| Memory size | 354.5 KiB |
| United States of America | 17845 |
|---|---|
| United Kingdom | 2235 |
| France | 1655 |
| Japan | 1358 |
| Italy | 1029 |
| Other values (2383) | 15036 |
Length
| Max length | 237 |
|---|---|
| Median length | 167 |
| Mean length | 19.045355 |
| Min length | 4 |
Characters and Unicode
| Total characters | 745778 |
|---|---|
| Distinct characters | 53 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1767 |
|---|---|
| Unique (%) | 4.5% |
Sample
| 1st row | United States of America |
|---|---|
| 2nd row | United States of America |
| 3rd row | United States of America |
| 4th row | United States of America |
| 5th row | United States of America |
Common Values
| Value | Count | Frequency (%) |
| United States of America | 17845 | 39.3% |
| United Kingdom | 2235 | 4.9% |
| France | 1655 | 3.6% |
| Japan | 1358 | 3.0% |
| Italy | 1029 | 2.3% |
| Canada | 840 | 1.9% |
| Germany | 748 | 1.6% |
| India | 735 | 1.6% |
| Russia | 734 | 1.6% |
| United Kingdom, United States of America | 569 | 1.3% |
| Other values (2378) | 11410 | 25.2% |
| (Missing) | 6208 | 13.7% |
Words
| Value | Count | Frequency (%) |
| united | 25263 | |
| states | 21147 | |
| of | 21146 | |
| america | 21146 | |
| kingdom | 4089 | 3.4% |
| france | 3937 | 3.3% |
| germany | 2257 | 1.9% |
| italy | 2167 | 1.8% |
| canada | 1765 | 1.5% |
| japan | 1650 | 1.4% |
| Other values (177) | 14162 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 80630 | 10.8% |
| (space) | 79571 | 10.7% |
| t | 72613 | 9.7% |
| a | 70475 | 9.4% |
| i | 58538 | 7.8% |
| n | 47476 | 6.4% |
| d | 34534 | 4.6% |
| r | 32478 | 4.4% |
| o | 29574 | 4.0% |
| m | 28694 | 3.8% |
| Other values (43) | 211195 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 558427 | |
| Uppercase Letter | 97545 | 13.1% |
| Space Separator | 79571 | 10.7% |
| Other Punctuation | 10235 | 1.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 80630 | |
| t | 72613 | |
| a | 70475 | |
| i | 58538 | |
| n | 47476 | |
| d | 34534 | |
| r | 32478 | |
| o | 29574 | 5.3% |
| m | 28694 | 5.1% |
| c | 26368 | 4.7% |
| Other values (16) | 77047 |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 25364 | |
| S | 23833 | |
| A | 22388 | |
| K | 5216 | 5.3% |
| F | 4330 | 4.4% |
| I | 3581 | 3.7% |
| C | 2594 | 2.7% |
| G | 2470 | 2.5% |
| J | 1666 | 1.7% |
| R | 1307 | 1.3% |
| Other values (14) | 4796 | 4.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 10230 | |
| ' | 5 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 79571 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 655972 | |
| Common | 89806 | 12.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 80630 | |
| t | 72613 | |
| a | 70475 | |
| i | 58538 | 8.9% |
| n | 47476 | 7.2% |
| d | 34534 | 5.3% |
| r | 32478 | 5.0% |
| o | 29574 | 4.5% |
| m | 28694 | 4.4% |
| c | 26368 | 4.0% |
| Other values (40) | 174592 |
Common
| Value | Count | Frequency (%) |
| (space) | 79571 | |
| , | 10230 | 11.4% |
| ' | 5 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 745778 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 80630 | 10.8% |
| (space) | 79571 | 10.7% |
| t | 72613 | 9.7% |
| a | 70475 | 9.4% |
| i | 58538 | 7.8% |
| n | 47476 | 6.4% |
| d | 34534 | 4.6% |
| r | 32478 | 4.4% |
| o | 29574 | 4.0% |
| m | 28694 | 3.8% |
| Other values (43) | 211195 |
release_date
Date
| Distinct | 17333 |
|---|---|
| Distinct (%) | 38.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 354.5 KiB |
| Minimum | 1874-12-09 00:00:00 |
|---|---|
| Maximum | 2020-12-16 00:00:00 |
revenue
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 6863 |
|---|---|
| Distinct (%) | 15.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11233994 |
| Minimum | 0 |
| Maximum | 2.7879651 × 10⁹ |
| Zeros | 37958 |
| Zeros (%) | 83.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 354.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 48025328 |
| Maximum | 2.7879651 × 10⁹ |
| Range | 2.7879651 × 10⁹ |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 64396963 |
|---|---|
| Coefficient of variation (CV) | 5.73233 |
| Kurtosis | 237.0229 |
| Mean | 11233994 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 12.253245 |
| Sum | 5.0964139 × 10¹¹ |
| Variance | 4.1469688 × 10¹⁵ |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 37958 | 83.7% |
| 12000000 | 20 | < 0.1% |
| 10000000 | 19 | < 0.1% |
| 11000000 | 19 | < 0.1% |
| 2000000 | 18 | < 0.1% |
| 6000000 | 17 | < 0.1% |
| 5000000 | 14 | < 0.1% |
| 8000000 | 13 | < 0.1% |
| 500000 | 13 | < 0.1% |
| 1 | 12 | < 0.1% |
| Other values (6853) | 7263 | 16.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 37958 | 83.7% |
| 1 | 12 | < 0.1% |
| 2 | 3 | < 0.1% |
| 3 | 9 | < 0.1% |
| 4 | 4 | < 0.1% |
| 5 | 5 | < 0.1% |
| 6 | 2 | < 0.1% |
| 7 | 4 | < 0.1% |
| 8 | 5 | < 0.1% |
| 9 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 2787965087 | 1 | < 0.1% |
| 2068223624 | 1 | < 0.1% |
| 1845034188 | 1 | < 0.1% |
| 1519557910 | 1 | < 0.1% |
| 1513528810 | 1 | < 0.1% |
| 1506249360 | 1 | < 0.1% |
| 1405403694 | 1 | < 0.1% |
| 1342000000 | 1 | < 0.1% |
| 1274219009 | 1 | < 0.1% |
| 1262886337 | 1 | < 0.1% |
runtime
Real number (ℝ)
| Distinct | 353 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 246 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 94.181738 |
| Minimum | 0 |
| Maximum | 1256 |
| Zeros | 1534 |
| Zeros (%) | 3.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 354.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 12 |
| Q1 | 85 |
| median | 95 |
| Q3 | 107 |
| 95-th percentile | 138 |
| Maximum | 1256 |
| Range | 1256 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 38.34118 |
|---|---|
| Coefficient of variation (CV) | 0.40709781 |
| Kurtosis | 93.944905 |
| Mean | 94.181738 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 4.4919715 |
| Sum | 4249480 |
| Variance | 1470.0461 |
| Monotonicity | Not monotonic |
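The reported mean is consistent with averaging over non-missing rows only. The short Python sketch below checks this using the observation count from the dataset overview and the Missing and Sum figures for runtime; that the tool skips missing values is an inference from these numbers, not something the report states.

```python
# Figures copied from the dataset overview and the runtime tables above.
n_observations = 45366
n_missing = 246
total = 4249480        # "Sum" of runtime

n_valid = n_observations - n_missing   # rows that actually have a runtime
mean = total / n_valid

print(n_valid, round(mean, 4))
```

Dividing by all 45366 observations instead would give a smaller value than the mean shown in the table.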
Common Values
| Value | Count | Frequency (%) |
| 90 | 2548 | 5.6% |
| 0 | 1534 | 3.4% |
| 100 | 1470 | 3.2% |
| 95 | 1412 | 3.1% |
| 93 | 1213 | 2.7% |
| 96 | 1104 | 2.4% |
| 92 | 1078 | 2.4% |
| 94 | 1062 | 2.3% |
| 91 | 1055 | 2.3% |
| 88 | 1030 | 2.3% |
| Other values (343) | 31614 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 1534 | 3.4% |
| 1 | 107 | 0.2% |
| 2 | 33 | 0.1% |
| 3 | 48 | 0.1% |
| 4 | 50 | 0.1% |
| 5 | 51 | 0.1% |
| 6 | 72 | 0.2% |
| 7 | 103 | 0.2% |
| 8 | 78 | 0.2% |
| 9 | 63 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 1256 | 1 | < 0.1% |
| 1140 | 2 | < 0.1% |
| 931 | 1 | < 0.1% |
| 925 | 1 | < 0.1% |
| 900 | 1 | < 0.1% |
| 877 | 1 | < 0.1% |
| 874 | 1 | < 0.1% |
| 840 | 2 | < 0.1% |
| 780 | 1 | < 0.1% |
| 720 | 1 | < 0.1% |
spoken_languages
Categorical
HIGH CARDINALITY  IMBALANCE  MISSING 
| Distinct | 1841 |
|---|---|
| Distinct (%) | 4.4% |
| Missing | 3888 |
| Missing (%) | 8.6% |
| Memory size | 354.5 KiB |
| English | 22377 |
|---|---|
| Français | 1853 |
| 日本語 | 1291 |
| Italiano | 1217 |
| Español | 901 |
| Other values (1836) | 13839 |
Length
| Max length | 171 |
|---|---|
| Median length | 7 |
| Mean length | 9.3976807 |
| Min length | 2 |
Characters and Unicode
| Total characters | 389797 |
|---|---|
| Distinct characters | 171 |
| Distinct categories | 8 |
| Distinct scripts | 15 |
| Distinct blocks | 16 |
Unique
| Unique | 1294 |
|---|---|
| Unique (%) | 3.1% |
Sample
| 1st row | English |
|---|---|
| 2nd row | English, Français |
| 3rd row | English |
| 4th row | English |
| 5th row | English |
Common Values
| Value | Count | Frequency (%) |
| English | 22377 | 49.3% |
| Français | 1853 | 4.1% |
| 日本語 | 1291 | 2.8% |
| Italiano | 1217 | 2.7% |
| Español | 901 | 2.0% |
| Pусский | 807 | 1.8% |
| Deutsch | 760 | 1.7% |
| English, Français | 681 | 1.5% |
| English, Español | 572 | 1.3% |
| हिन्दी | 480 | 1.1% |
| Other values (1831) | 10539 | 23.2% |
| (Missing) | 3888 | 8.6% |
Words
| Value | Count | Frequency (%) |
| english | 28725 | |
| français | 4194 | 7.7% |
| deutsch | 2623 | 4.8% |
| español | 2412 | 4.4% |
| italiano | 2366 | 4.4% |
| 日本語 | 1760 | 3.2% |
| pусский | 1562 | 2.9% |
| 普通话 | 790 | 1.5% |
| हिन्दी | 706 | 1.3% |
|  | 663 | 1.2% |
| Other values (69) | 8559 | 15.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 42259 | |
| n | 37456 | 9.6% |
| i | 37103 | 9.5% |
| l | 34627 | 8.9% |
| h | 31454 | 8.1% |
| E | 31194 | 8.0% |
| g | 30409 | 7.8% |
| a | 18944 | 4.9% |
| (space) | 13076 | 3.4% |
| , | 11663 | 3.0% |
| Other values (161) | 101612 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 291973 | |
| Uppercase Letter | 46421 | 11.9% |
| Other Letter | 22189 | 5.7% |
| Space Separator | 13076 | 3.4% |
| Other Punctuation | 12728 | 3.3% |
| Spacing Mark | 1836 | 0.5% |
| Nonspacing Mark | 1548 | 0.4% |
| Control | 26 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 42259 | |
| n | 37456 | |
| i | 37103 | |
| l | 34627 | |
| h | 31454 | |
| g | 30409 | |
| a | 18944 | |
| o | 7050 | 2.4% |
| r | 6127 | 2.1% |
| t | 5976 | 2.0% |
| Other values (63) | 40568 |
Other Letter
| Value | Count | Frequency (%) |
| 語 | 1760 | 7.9% |
| 本 | 1760 | 7.9% |
| 日 | 1760 | 7.9% |
| 话 | 1263 | 5.7% |
| 州 | 946 | 4.3% |
| 普 | 790 | 3.6% |
| 通 | 790 | 3.6% |
| न | 706 | 3.2% |
| द | 706 | 3.2% |
| ह | 706 | 3.2% |
| Other values (46) | 11002 |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 31194 | |
| F | 4196 | 9.0% |
| D | 2924 | 6.3% |
| P | 2677 | 5.8% |
| I | 2366 | 5.1% |
| N | 828 | 1.8% |
| L | 505 | 1.1% |
| M | 362 | 0.8% |
| T | 308 | 0.7% |
| Č | 284 | 0.6% |
| Other values (13) | 777 | 1.7% |
Spacing Mark
| Value | Count | Frequency (%) |
| ी | 706 | |
| ि | 706 | |
| ు | 136 | 7.4% |
| ி | 111 | 6.0% |
| া | 94 | 5.1% |
| ং | 47 | 2.6% |
| ੀ | 18 | 1.0% |
| ਾ | 18 | 1.0% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ् | 706 | |
| ִ | 430 | |
| ְ | 215 | 13.9% |
| ் | 111 | 7.2% |
| ె | 68 | 4.4% |
| ੰ | 18 | 1.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 11663 | |
| / | 1015 | 8.0% |
| ? | 50 | 0.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 13076 | |
Control
| Value | Count | Frequency (%) |
| | 26 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 326005 | |
| Common | 25830 | 6.6% |
| Han | 10488 | 2.7% |
| Cyrillic | 10454 | 2.7% |
| Devanagari | 4236 | 1.1% |
| Arabic | 3339 | 0.9% |
| Hangul | 3252 | 0.8% |
| Hebrew | 1720 | 0.4% |
| Greek | 1704 | 0.4% |
| Thai | 1232 | 0.3% |
| Other values (5) | 1537 | 0.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 42259 | |
| n | 37456 | |
| i | 37103 | |
| l | 34627 | |
| h | 31454 | |
| E | 31194 | |
| g | 30409 | |
| a | 18944 | 5.8% |
| o | 7050 | 2.2% |
| r | 6127 | 1.9% |
| Other values (50) | 49382 |
Cyrillic
| Value | Count | Frequency (%) |
| с | 3211 | |
| к | 1734 | |
| и | 1679 | |
| й | 1615 | |
| у | 1564 | |
| а | 113 | 1.1% |
| р | 87 | 0.8% |
| У | 53 | 0.5% |
| ї | 53 | 0.5% |
| н | 53 | 0.5% |
| Other values (12) | 292 | 2.8% |
Arabic
| Value | Count | Frequency (%) |
| ا | 536 | |
| ر | 536 | |
| ل | 341 | |
| ع | 341 | |
| ب | 341 | |
| ي | 341 | |
| ة | 341 | |
| ی | 140 | 4.2% |
| ف | 140 | 4.2% |
| س | 140 | 4.2% |
| Other values (5) | 142 | 4.3% |
Han
| Value | Count | Frequency (%) |
| 語 | 1760 | |
| 本 | 1760 | |
| 日 | 1760 | |
| 话 | 1263 | |
| 州 | 946 | |
| 普 | 790 | |
| 通 | 790 | |
| 广 | 473 | 4.5% |
| 廣 | 473 | 4.5% |
| 話 | 473 | 4.5% |
Hebrew
| Value | Count | Frequency (%) |
| ִ | 430 | |
| ת | 215 | |
| י | 215 | |
| ר | 215 | |
| ְ | 215 | |
| ב | 215 | |
| ע | 215 |
Greek
| Value | Count | Frequency (%) |
| λ | 426 | |
| ά | 213 | |
| κ | 213 | |
| ι | 213 | |
| ν | 213 | |
| η | 213 | |
| ε | 213 |
Georgian
| Value | Count | Frequency (%) |
| ლ | 33 | |
| ი | 33 | |
| უ | 33 | |
| თ | 33 | |
| რ | 33 | |
| ა | 33 | |
| ქ | 33 |
Devanagari
| Value | Count | Frequency (%) |
| न | 706 | |
| द | 706 | |
| ी | 706 | |
| ् | 706 | |
| ह | 706 | |
| ि | 706 |
Hangul
| Value | Count | Frequency (%) |
| 말 | 542 | |
| 조 | 542 | |
| 어 | 542 | |
| 국 | 542 | |
| 선 | 542 | |
| 한 | 542 |
Thai
| Value | Count | Frequency (%) |
| า | 352 | |
| ท | 176 | |
| ย | 176 | |
| ไ | 176 | |
| ษ | 176 | |
| ภ | 176 |
Gurmukhi
| Value | Count | Frequency (%) |
| ੀ | 18 | |
| ਬ | 18 | |
| ਾ | 18 | |
| ਜ | 18 | |
| ੰ | 18 | |
| ਪ | 18 |
Common
| Value | Count | Frequency (%) |
| (space) | 13076 | |
| , | 11663 | |
| / | 1015 | 3.9% |
| ? | 50 | 0.2% |
| | 26 | 0.1% |
Telugu
| Value | Count | Frequency (%) |
| ు | 136 | |
| గ | 68 | |
| ల | 68 | |
| ె | 68 | |
| త | 68 |
Tamil
| Value | Count | Frequency (%) |
| த | 111 | |
| ி | 111 | |
| ழ | 111 | |
| ் | 111 | |
| ம | 111 |
Bengali
| Value | Count | Frequency (%) |
| া | 94 | |
| ল | 47 | |
| ং | 47 | |
| ব | 47 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 342979 | |
| CJK | 10488 | 2.7% |
| Cyrillic | 10454 | 2.7% |
| None | 10434 | 2.7% |
| Devanagari | 4236 | 1.1% |
| Arabic | 3339 | 0.9% |
| Hangul | 3252 | 0.8% |
| Hebrew | 1720 | 0.4% |
| Thai | 1232 | 0.3% |
| Tamil | 555 | 0.1% |
| Other values (6) | 1108 | 0.3% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 42259 | |
| n | 37456 | |
| i | 37103 | |
| l | 34627 | |
| h | 31454 | |
| E | 31194 | |
| g | 30409 | |
| a | 18944 | 5.5% |
| (space) | 13076 | 3.8% |
| , | 11663 | 3.4% |
| Other values (38) | 54794 |
None
| Value | Count | Frequency (%) |
| ç | 4441 | |
| ñ | 2412 | |
| ê | 591 | 5.7% |
| λ | 426 | 4.1% |
| ý | 284 | 2.7% |
| Č | 284 | 2.7% |
| ü | 247 | 2.4% |
| ά | 213 | 2.0% |
| κ | 213 | 2.0% |
| ι | 213 | 2.0% |
| Other values (11) | 1110 | 10.6% |
Cyrillic
| Value | Count | Frequency (%) |
| с | 3211 | |
| к | 1734 | |
| и | 1679 | |
| й | 1615 | |
| у | 1564 | |
| а | 113 | 1.1% |
| р | 87 | 0.8% |
| У | 53 | 0.5% |
| ї | 53 | 0.5% |
| н | 53 | 0.5% |
| Other values (12) | 292 | 2.8% |
CJK
| Value | Count | Frequency (%) |
| 語 | 1760 | |
| 本 | 1760 | |
| 日 | 1760 | |
| 话 | 1263 | |
| 州 | 946 | |
| 普 | 790 | |
| 通 | 790 | |
| 广 | 473 | 4.5% |
| 廣 | 473 | 4.5% |
| 話 | 473 | 4.5% |
Devanagari
| Value | Count | Frequency (%) |
| न | 706 | |
| द | 706 | |
| ी | 706 | |
| ् | 706 | |
| ह | 706 | |
| ि | 706 |
Hangul
| Value | Count | Frequency (%) |
| 말 | 542 | |
| 조 | 542 | |
| 어 | 542 | |
| 국 | 542 | |
| 선 | 542 | |
| 한 | 542 |
Arabic
| Value | Count | Frequency (%) |
| ا | 536 | |
| ر | 536 | |
| ل | 341 | |
| ع | 341 | |
| ب | 341 | |
| ي | 341 | |
| ة | 341 | |
| ی | 140 | 4.2% |
| ف | 140 | 4.2% |
| س | 140 | 4.2% |
| Other values (5) | 142 | 4.3% |
Hebrew
| Value | Count | Frequency (%) |
| ִ | 430 | |
| ת | 215 | |
| י | 215 | |
| ר | 215 | |
| ְ | 215 | |
| ב | 215 | |
| ע | 215 |
Thai
| Value | Count | Frequency (%) |
| า | 352 | |
| ท | 176 | |
| ย | 176 | |
| ไ | 176 | |
| ษ | 176 | |
| ภ | 176 |
Telugu
| Value | Count | Frequency (%) |
| ు | 136 | |
| గ | 68 | |
| ల | 68 | |
| ె | 68 | |
| త | 68 |
Tamil
| Value | Count | Frequency (%) |
| த | 111 | |
| ி | 111 | |
| ழ | 111 | |
| ் | 111 | |
| ம | 111 |
Bengali
| Value | Count | Frequency (%) |
| া | 94 | |
| ল | 47 | |
| ং | 47 | |
| ব | 47 |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ế | 61 | |
| ệ | 61 |
Georgian
| Value | Count | Frequency (%) |
| ლ | 33 | |
| ი | 33 | |
| უ | 33 | |
| თ | 33 | |
| რ | 33 | |
| ა | 33 | |
| ქ | 33 |
Gurmukhi
| Value | Count | Frequency (%) |
| ੀ | 18 | |
| ਬ | 18 | |
| ਾ | 18 | |
| ਜ | 18 | |
| ੰ | 18 | |
| ਪ | 18 |
IPA Ext
| Value | Count | Frequency (%) |
| ə | 4 |
status
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 80 |
| Missing (%) | 0.2% |
| Memory size | 354.5 KiB |
| Released | 44927 |
|---|---|
| Rumored | 229 |
| Post Production | 97 |
| In Production | 19 |
| Planned | 13 |
Length
| Max length | 15 |
|---|---|
| Median length | 8 |
| Mean length | 8.0117476 |
| Min length | 7 |
Characters and Unicode
| Total characters | 362820 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Released |
|---|---|
| 2nd row | Released |
| 3rd row | Released |
| 4th row | Released |
| 5th row | Released |
Common Values
| Value | Count | Frequency (%) |
| Released | 44927 | 99.0% |
| Rumored | 229 | 0.5% |
| Post Production | 97 | 0.2% |
| In Production | 19 | < 0.1% |
| Planned | 13 | < 0.1% |
| Canceled | 1 | < 0.1% |
| (Missing) | 80 | 0.2% |
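Each Frequency (%) figure in the table above is the row count divided by the total number of observations (45366 in the dataset overview). The Python sketch below reproduces the printed labels from the counts; the `frequency_label` helper and its rule of showing `< 0.1%` for any non-zero share below 0.1% are assumptions chosen to match the table, not a documented rule of the profiling tool.

```python
def frequency_label(count, total):
    """Format count/total as a percentage label in the style of the tables above."""
    pct = 100 * count / total
    if 0 < pct < 0.1:
        # Assumed display rule for very small, non-zero shares.
        return "< 0.1%"
    return f"{pct:.1f}%"

n_observations = 45366  # from the dataset overview

# Counts copied from the status Common Values table.
counts = {
    "Released": 44927,
    "Rumored": 229,
    "Post Production": 97,
    "In Production": 19,
    "Planned": 13,
    "Canceled": 1,
    "(Missing)": 80,
}

labels = {value: frequency_label(n, n_observations) for value, n in counts.items()}
for value, label in labels.items():
    print(f"{value}: {label}")
```

The counts sum to the number of observations, so the denominator includes missing rows.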
Words
| Value | Count | Frequency (%) |
| released | 44927 | |
| rumored | 229 | 0.5% |
| production | 116 | 0.3% |
| post | 97 | 0.2% |
| in | 19 | < 0.1% |
| planned | 13 | < 0.1% |
| canceled | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 135025 | |
| d | 45286 | 12.5% |
| R | 45156 | 12.4% |
| s | 45024 | 12.4% |
| l | 44941 | 12.4% |
| a | 44941 | 12.4% |
| o | 558 | 0.2% |
| r | 345 | 0.1% |
| u | 345 | 0.1% |
| m | 229 | 0.1% |
| Other values (8) | 970 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 317302 | |
| Uppercase Letter | 45402 | 12.5% |
| Space Separator | 116 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 135025 | |
| d | 45286 | 14.3% |
| s | 45024 | 14.2% |
| l | 44941 | 14.2% |
| a | 44941 | 14.2% |
| o | 558 | 0.2% |
| r | 345 | 0.1% |
| u | 345 | 0.1% |
| m | 229 | 0.1% |
| t | 213 | 0.1% |
| Other values (3) | 395 | 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 45156 | |
| P | 226 | 0.5% |
| I | 19 | < 0.1% |
| C | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 116 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 362704 | |
| Common | 116 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 135025 | |
| d | 45286 | 12.5% |
| R | 45156 | 12.4% |
| s | 45024 | 12.4% |
| l | 44941 | 12.4% |
| a | 44941 | 12.4% |
| o | 558 | 0.2% |
| r | 345 | 0.1% |
| u | 345 | 0.1% |
| m | 229 | 0.1% |
| Other values (7) | 854 | 0.2% |
Common
| Value | Count | Frequency (%) |
| (space) | 116 | |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 362820 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 135025 | 37.2% |
| d | 45286 | 12.5% |
| R | 45156 | 12.4% |
| s | 45024 | 12.4% |
| l | 44941 | 12.4% |
| a | 44941 | 12.4% |
| o | 558 | 0.2% |
| r | 345 | 0.1% |
| u | 345 | 0.1% |
| m | 229 | 0.1% |
| Other values (8) | 970 | 0.3% |
tagline
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 20269 |
|---|---|
| Distinct (%) | 99.4% |
| Missing | 24970 |
| Missing (%) | 55.0% |
| Memory size | 354.5 KiB |
| Based on a true story. | 7 |
|---|---|
| Some things are better left top secret. | 4 |
| Be careful what you wish for. | 4 |
| - | 4 |
| Trust no one. | 4 |
| Other values (20264) | 20373 |
Length
| Max length | 297 |
|---|---|
| Median length | 204 |
| Mean length | 46.998333 |
| Min length | 1 |
Characters and Unicode
| Total characters | 958578 |
|---|---|
| Distinct characters | 170 |
| Distinct categories | 17 |
| Distinct scripts | 6 |
| Distinct blocks | 10 |
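The reported mean length is consistent with dividing the total character count by the number of non-missing taglines rather than by all observations. A short sketch of that cross-check follows, using figures from this report; the variable names are illustrative and the interpretation of the denominator is an inference from the numbers, not something the report states.

```python
n_observations = 45366      # "Number of observations" in the dataset statistics
n_missing = 24970           # tagline "Missing"
total_characters = 958578   # tagline "Total characters"

# Only rows that actually contain a tagline contribute to the length statistics.
n_present = n_observations - n_missing
mean_length = total_characters / n_present

print(n_present, round(mean_length, 6))
```

Dividing by all 45366 observations instead would give roughly 21.1, which does not match the reported mean of 46.998333.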
Unique
| Unique | 20166 |
|---|---|
| Unique (%) | 98.9% |
Sample
| 1st row | Roll the dice and unleash the excitement! |
|---|---|
| 2nd row | Still Yelling. Still Fighting. Still Ready for Love. |
| 3rd row | Friends are the people who let you be yourself... and never let you forget it. |
| 4th row | Just When His World Is Back To Normal... He's In For The Surprise Of His Life! |
| 5th row | A Los Angeles Crime Saga |
Common Values
| Value | Count | Frequency (%) |
| Based on a true story. | 7 | < 0.1% |
| Some things are better left top secret. | 4 | < 0.1% |
| Be careful what you wish for. | 4 | < 0.1% |
| - | 4 | < 0.1% |
| Trust no one. | 4 | < 0.1% |
| Documentary | 3 | < 0.1% |
| The end is near. | 3 | < 0.1% |
| There are two sides to every love story. | 3 | < 0.1% |
| Drama | 3 | < 0.1% |
| Classic Albums | 3 | < 0.1% |
| Other values (20259) | 20358 | 44.9% |
| (Missing) | 24970 | 55.0% |
Words
| Value | Count | Frequency (%) |
| the | 10993 | 6.3% |
| a | 6812 | 3.9% |
| of | 4403 | 2.5% |
| to | 3582 | 2.1% |
| is | 2793 | 1.6% |
| in | 2693 | 1.5% |
| and | 2682 | 1.5% |
| you | 2389 | 1.4% |
|  | 1580 | 0.9% |
| for | 1523 | 0.9% |
| Other values (15100) | 134460 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 153662 | 16.0% |
| e | 94404 | 9.8% |
| t | 57263 | 6.0% |
| o | 56557 | 5.9% |
| a | 51467 | 5.4% |
| n | 47494 | 5.0% |
| i | 46029 | 4.8% |
| r | 44978 | 4.7% |
| s | 42358 | 4.4% |
| h | 37161 | 3.9% |
| Other values (160) | 327205 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 680397 | 71.0% |
| Space Separator | 153662 | 16.0% |
| Uppercase Letter | 74988 | 7.8% |
| Other Punctuation | 44582 | 4.7% |
| Decimal Number | 2687 | 0.3% |
| Dash Punctuation | 1942 | 0.2% |
| Final Punctuation | 98 | < 0.1% |
| Open Punctuation | 56 | < 0.1% |
| Close Punctuation | 55 | < 0.1% |
| Currency Symbol | 37 | < 0.1% |
| Other values (7) | 74 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 94404 | |
| t | 57263 | 8.4% |
| o | 56557 | 8.3% |
| a | 51467 | 7.6% |
| n | 47494 | 7.0% |
| i | 46029 | 6.8% |
| r | 44978 | 6.6% |
| s | 42358 | 6.2% |
| h | 37161 | 5.5% |
| l | 30172 | 4.4% |
| Other values (43) | 172514 |
Other Letter
| Value | Count | Frequency (%) |
| வ | 1 | 2.9% |
| ன | 1 | 2.9% |
| த | 1 | 2.9% |
| ஆ | 1 | 2.9% |
| 蜜 | 1 | 2.9% |
| 時 | 1 | 2.9% |
| 熟 | 1 | 2.9% |
| 成 | 1 | 2.9% |
| ナ | 1 | 2.9% |
| 劇 | 1 | 2.9% |
| Other values (24) | 24 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 10008 | 13.3% |
| A | 6873 | 9.2% |
| S | 5653 | 7.5% |
| H | 4402 | 5.9% |
| I | 4387 | 5.9% |
| E | 4306 | 5.7% |
| W | 3679 | 4.9% |
| O | 3477 | 4.6% |
| N | 3195 | 4.3% |
| L | 3194 | 4.3% |
| Other values (20) | 25814 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 26648 | |
| ! | 5784 | 13.0% |
| ' | 5674 | 12.7% |
| , | 4224 | 9.5% |
| ? | 1159 | 2.6% |
| " | 582 | 1.3% |
| … | 148 | 0.3% |
| : | 138 | 0.3% |
| & | 83 | 0.2% |
| * | 42 | 0.1% |
| Other values (7) | 100 | 0.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 802 | |
| 1 | 516 | |
| 2 | 299 | 11.1% |
| 9 | 208 | 7.7% |
| 3 | 208 | 7.7% |
| 5 | 168 | 6.3% |
| 4 | 140 | 5.2% |
| 6 | 121 | 4.5% |
| 7 | 121 | 4.5% |
| 8 | 104 | 3.9% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 5 | |
| + | 5 | |
| | | 2 | 14.3% |
| ~ | 1 | 7.1% |
| − | 1 | 7.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1925 | |
| – | 9 | 0.5% |
| — | 8 | 0.4% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 82 | |
| ” | 15 | 15.3% |
| » | 1 | 1.0% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 14 | |
| ‘ | 4 | 21.1% |
| « | 1 | 5.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 49 | |
| [ | 7 | 12.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 48 | |
| ] | 7 | 12.7% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 2 | |
| ² | 1 |
Modifier Letter
| Value | Count | Frequency (%) |
| ˌ | 1 | |
| ˈ | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 153662 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 37 |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ் | 1 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 755385 | |
| Common | 203158 | 21.2% |
| Han | 21 | < 0.1% |
| Tamil | 5 | < 0.1% |
| Hiragana | 5 | < 0.1% |
| Katakana | 4 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 94404 | 12.5% |
| t | 57263 | 7.6% |
| o | 56557 | 7.5% |
| a | 51467 | 6.8% |
| n | 47494 | 6.3% |
| i | 46029 | 6.1% |
| r | 44978 | 6.0% |
| s | 42358 | 5.6% |
| h | 37161 | 4.9% |
| l | 30172 | 4.0% |
| Other values (73) | 247502 |
Common
| Value | Count | Frequency (%) |
| (space) | 153662 | 75.6% |
| . | 26648 | 13.1% |
| ! | 5784 | 2.8% |
| ' | 5674 | 2.8% |
| , | 4224 | 2.1% |
| - | 1925 | 0.9% |
| ? | 1159 | 0.6% |
| 0 | 802 | 0.4% |
| " | 582 | 0.3% |
| 1 | 516 | 0.3% |
| Other values (42) | 2182 | 1.1% |
Han
| Value | Count | Frequency (%) |
| 蜜 | 1 | 4.8% |
| 時 | 1 | 4.8% |
| 熟 | 1 | 4.8% |
| 成 | 1 | 4.8% |
| 劇 | 1 | 4.8% |
| 場 | 1 | 4.8% |
| 版 | 1 | 4.8% |
| 舞 | 1 | 4.8% |
| 的 | 1 | 4.8% |
| 后 | 1 | 4.8% |
| Other values (11) | 11 |
Tamil
| Value | Count | Frequency (%) |
| வ | 1 | |
| ன | 1 | |
| ் | 1 | |
| த | 1 | |
| ஆ | 1 |
Hiragana
| Value | Count | Frequency (%) |
| は | 1 | |
| し | 1 | |
| て | 1 | |
| い | 1 | |
| る | 1 |
Katakana
| Value | Count | Frequency (%) |
| ナ | 1 | |
| ク | 1 | |
| ラ | 1 | |
| ド | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 958148 | |
| Punctuation | 280 | < 0.1% |
| None | 110 | < 0.1% |
| CJK | 21 | < 0.1% |
| Tamil | 5 | < 0.1% |
| Hiragana | 5 | < 0.1% |
| Katakana | 4 | < 0.1% |
| IPA Ext | 2 | < 0.1% |
| Modifier Letters | 2 | < 0.1% |
| Math Operators | 1 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 153662 | 16.0% |
| e | 94404 | 9.9% |
| t | 57263 | 6.0% |
| o | 56557 | 5.9% |
| a | 51467 | 5.4% |
| n | 47494 | 5.0% |
| i | 46029 | 4.8% |
| r | 44978 | 4.7% |
| s | 42358 | 4.4% |
| h | 37161 | 3.9% |
| Other values (78) | 326775 |
Punctuation
| Value | Count | Frequency (%) |
| … | 148 | |
| ’ | 82 | |
| ” | 15 | 5.4% |
| “ | 14 | 5.0% |
| – | 9 | 3.2% |
| — | 8 | 2.9% |
| ‘ | 4 | 1.4% |
None
| Value | Count | Frequency (%) |
| é | 18 | |
| ä | 16 | |
| ö | 8 | 7.3% |
| ó | 6 | 5.5% |
| á | 6 | 5.5% |
| í | 5 | 4.5% |
| ı | 5 | 4.5% |
| ü | 5 | 4.5% |
| · | 4 | 3.6% |
| ñ | 3 | 2.7% |
| Other values (26) | 34 |
IPA Ext
| Value | Count | Frequency (%) |
| ə | 2 |
Tamil
| Value | Count | Frequency (%) |
| வ | 1 | |
| ன | 1 | |
| ் | 1 | |
| த | 1 | |
| ஆ | 1 |
CJK
| Value | Count | Frequency (%) |
| 蜜 | 1 | 4.8% |
| 時 | 1 | 4.8% |
| 熟 | 1 | 4.8% |
| 成 | 1 | 4.8% |
| 劇 | 1 | 4.8% |
| 場 | 1 | 4.8% |
| 版 | 1 | 4.8% |
| 舞 | 1 | 4.8% |
| 的 | 1 | 4.8% |
| 后 | 1 | 4.8% |
| Other values (11) | 11 |
Katakana
| Value | Count | Frequency (%) |
| ナ | 1 | |
| ク | 1 | |
| ラ | 1 | |
| ド | 1 |
Modifier Letters
| Value | Count | Frequency (%) |
| ˌ | 1 | |
| ˈ | 1 |
Hiragana
| Value | Count | Frequency (%) |
| は | 1 | |
| し | 1 | |
| て | 1 | |
| い | 1 | |
| る | 1 |
Math Operators
| Value | Count | Frequency (%) |
| − | 1 |
title
Categorical
HIGH CARDINALITY  UNIFORM 
| Distinct | 42195 |
|---|---|
| Distinct (%) | 93.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 354.5 KiB |
| Cinderella | 11 |
|---|---|
| Alice in Wonderland | 9 |
| Hamlet | 9 |
| Les Misérables | 8 |
| Beauty and the Beast | 8 |
| Other values (42190) | 45321 |
Length
| Max length | 105 |
|---|---|
| Median length | 79 |
| Mean length | 16.703611 |
| Min length | 1 |
Characters and Unicode
| Total characters | 757776 |
|---|---|
| Distinct characters | 287 |
| Distinct categories | 17 |
| Distinct scripts | 7 |
| Distinct blocks | 12 |
Unique
| Unique | 39877 |
|---|---|
| Unique (%) | 87.9% |
Sample
| 1st row | Toy Story |
|---|---|
| 2nd row | Jumanji |
| 3rd row | Grumpier Old Men |
| 4th row | Waiting to Exhale |
| 5th row | Father of the Bride Part II |
Common Values
| Value | Count | Frequency (%) |
| Cinderella | 11 | < 0.1% |
| Alice in Wonderland | 9 | < 0.1% |
| Hamlet | 9 | < 0.1% |
| Les Misérables | 8 | < 0.1% |
| Beauty and the Beast | 8 | < 0.1% |
| The Three Musketeers | 7 | < 0.1% |
| Treasure Island | 7 | < 0.1% |
| A Christmas Carol | 7 | < 0.1% |
| The Hound of the Baskervilles | 6 | < 0.1% |
| Countdown | 6 | < 0.1% |
| Other values (42185) | 45288 | 99.8% |
Words
| Value | Count | Frequency (%) |
| the | 14550 | 10.7% |
| of | 4930 | 3.6% |
| a | 2243 | 1.6% |
| in | 1693 | 1.2% |
| and | 1631 | 1.2% |
| to | 1054 | 0.8% |
|  | 757 | 0.6% |
| man | 665 | 0.5% |
| love | 664 | 0.5% |
| for | 601 | 0.4% |
| Other values (24353) | 107377 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 90821 | 12.0% |
| e | 76236 | 10.1% |
| a | 48933 | 6.5% |
| o | 45664 | 6.0% |
| n | 40820 | 5.4% |
| r | 40005 | 5.3% |
| i | 39768 | 5.2% |
| t | 36716 | 4.8% |
| s | 29516 | 3.9% |
| h | 28508 | 3.8% |
| Other values (277) | 280789 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 534068 | |
| Uppercase Letter | 117247 | 15.5% |
| Space Separator | 90821 | 12.0% |
| Other Punctuation | 10487 | 1.4% |
| Decimal Number | 3858 | 0.5% |
| Dash Punctuation | 981 | 0.1% |
| Close Punctuation | 87 | < 0.1% |
| Open Punctuation | 85 | < 0.1% |
| Final Punctuation | 38 | < 0.1% |
| Other Letter | 25 | < 0.1% |
| Other values (7) | 79 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 76236 | |
| a | 48933 | |
| o | 45664 | 8.6% |
| n | 40820 | 7.6% |
| r | 40005 | 7.5% |
| i | 39768 | 7.4% |
| t | 36716 | 6.9% |
| s | 29516 | 5.5% |
| h | 28508 | 5.3% |
| l | 25927 | 4.9% |
| Other values (121) | 121975 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 16013 | |
| S | 10333 | 8.8% |
| M | 8032 | 6.9% |
| B | 7655 | 6.5% |
| C | 7170 | 6.1% |
| A | 6785 | 5.8% |
| D | 6334 | 5.4% |
| L | 5869 | 5.0% |
| H | 5170 | 4.4% |
| W | 5167 | 4.4% |
| Other values (65) | 38719 |
Other Letter
| Value | Count | Frequency (%) |
| چ | 2 | 8.0% |
| ه | 2 | 8.0% |
| ک | 2 | 8.0% |
| ی | 2 | 8.0% |
| 狗 | 1 | 4.0% |
| 貓 | 1 | 4.0% |
| ª | 1 | 4.0% |
| 時 | 1 | 4.0% |
| 傳 | 1 | 4.0% |
| 空 | 1 | 4.0% |
| Other values (11) | 11 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 3717 | |
| ' | 2504 | |
| . | 1603 | |
| , | 1133 | 10.8% |
| ! | 647 | 6.2% |
| & | 458 | 4.4% |
| ? | 269 | 2.6% |
| / | 79 | 0.8% |
| * | 19 | 0.2% |
| # | 13 | 0.1% |
| Other values (8) | 45 | 0.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 861 | |
| 1 | 701 | |
| 0 | 616 | |
| 3 | 482 | |
| 9 | 232 | 6.0% |
| 4 | 229 | 5.9% |
| 5 | 227 | 5.9% |
| 7 | 193 | 5.0% |
| 8 | 161 | 4.2% |
| 6 | 156 | 4.0% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 17 | |
| × | 3 | 12.5% |
| ∞ | 1 | 4.2% |
| = | 1 | 4.2% |
| → | 1 | 4.2% |
| − | 1 | 4.2% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 12 | |
| ² | 3 | 15.8% |
| ³ | 2 | 10.5% |
| ⅓ | 1 | 5.3% |
| ⁴ | 1 | 5.3% |
Other Symbol
| Value | Count | Frequency (%) |
| ° | 3 | |
| ☆ | 2 | |
| ™ | 1 | 12.5% |
| ♡ | 1 | 12.5% |
| № | 1 | 12.5% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 18 | |
| ¢ | 2 | 9.5% |
| £ | 1 | 4.8% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 966 | |
| – | 15 | 1.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 82 | |
| ] | 5 | 5.7% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 80 | |
| [ | 5 | 5.9% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 37 | |
| ” | 1 | 2.6% |
Initial Punctuation
| Value | Count | Frequency (%) |
| ‘ | 1 | |
| “ | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 90821 | 100.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 3 |
Format
| Value | Count | Frequency (%) |
| | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 650800 | |
| Common | 106436 | 14.0% |
| Cyrillic | 346 | < 0.1% |
| Greek | 170 | < 0.1% |
| Arabic | 11 | < 0.1% |
| Katakana | 8 | < 0.1% |
| Han | 5 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 76236 | 11.7% |
| a | 48933 | 7.5% |
| o | 45664 | 7.0% |
| n | 40820 | 6.3% |
| r | 40005 | 6.1% |
| i | 39768 | 6.1% |
| t | 36716 | 5.6% |
| s | 29516 | 4.5% |
| h | 28508 | 4.4% |
| l | 25927 | 4.0% |
| Other values (107) | 238707 |
Common
| Value | Count | Frequency (%) |
| (space) | 90821 | 85.3% |
| : | 3717 | 3.5% |
| ' | 2504 | 2.4% |
| . | 1603 | 1.5% |
| , | 1133 | 1.1% |
| - | 966 | 0.9% |
| 2 | 861 | 0.8% |
| 1 | 701 | 0.7% |
| ! | 647 | 0.6% |
| 0 | 616 | 0.6% |
| Other values (50) | 2867 | 2.7% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 32 | 9.2% |
| е | 32 | 9.2% |
| а | 29 | 8.4% |
| н | 24 | 6.9% |
| и | 23 | 6.6% |
| р | 22 | 6.4% |
| к | 17 | 4.9% |
| с | 15 | 4.3% |
| т | 14 | 4.0% |
| в | 14 | 4.0% |
| Other values (38) | 124 |
Greek
| Value | Count | Frequency (%) |
| α | 20 | 11.8% |
| ο | 14 | 8.2% |
| ι | 14 | 8.2% |
| τ | 9 | 5.3% |
| ρ | 8 | 4.7% |
| ά | 8 | 4.7% |
| λ | 8 | 4.7% |
| ν | 7 | 4.1% |
| ς | 6 | 3.5% |
| ε | 6 | 3.5% |
| Other values (32) | 70 |
Katakana
| Value | Count | Frequency (%) |
| テ | 1 | |
| ポ | 1 | |
| ィ | 1 | |
| ス | 1 | |
| タ | 1 | |
| ン | 1 | |
| ァ | 1 | |
| フ | 1 |
Arabic
| Value | Count | Frequency (%) |
| چ | 2 | |
| ه | 2 | |
| ک | 2 | |
| ی | 2 | |
| س | 1 | |
| ا | 1 | |
| ج | 1 |
Han
| Value | Count | Frequency (%) |
| 狗 | 1 | |
| 貓 | 1 | |
| 時 | 1 | |
| 傳 | 1 | |
| 空 | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 756212 | |
| None | 1123 | 0.1% |
| Cyrillic | 346 | < 0.1% |
| Punctuation | 62 | < 0.1% |
| Arabic | 11 | < 0.1% |
| Katakana | 8 | < 0.1% |
| CJK | 5 | < 0.1% |
| Misc Symbols | 3 | < 0.1% |
| Letterlike Symbols | 2 | < 0.1% |
| Math Operators | 2 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 90821 | 12.0% |
| e | 76236 | 10.1% |
| a | 48933 | 6.5% |
| o | 45664 | 6.0% |
| n | 40820 | 5.4% |
| r | 40005 | 5.3% |
| i | 39768 | 5.3% |
| t | 36716 | 4.9% |
| s | 29516 | 3.9% |
| h | 28508 | 3.8% |
| Other values (76) | 279225 |
None
| Value | Count | Frequency (%) |
| é | 218 | |
| ä | 127 | 11.3% |
| ö | 55 | 4.9% |
| è | 53 | 4.7% |
| ô | 44 | 3.9% |
| ü | 39 | 3.5% |
| ó | 37 | 3.3% |
| á | 35 | 3.1% |
| ı | 35 | 3.1% |
| í | 33 | 2.9% |
| Other values (108) | 447 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 37 | |
| – | 15 | |
| … | 5 | 8.1% |
| | 2 | 3.2% |
| ‘ | 1 | 1.6% |
| ” | 1 | 1.6% |
| “ | 1 | 1.6% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 32 | 9.2% |
| е | 32 | 9.2% |
| а | 29 | 8.4% |
| н | 24 | 6.9% |
| и | 23 | 6.6% |
| р | 22 | 6.4% |
| к | 17 | 4.9% |
| с | 15 | 4.3% |
| т | 14 | 4.0% |
| в | 14 | 4.0% |
| Other values (38) | 124 |
Arabic
| Value | Count | Frequency (%) |
| چ | 2 | |
| ه | 2 | |
| ک | 2 | |
| ی | 2 | |
| س | 1 | |
| ا | 1 | |
| ج | 1 |
Misc Symbols
| Value | Count | Frequency (%) |
| ☆ | 2 | |
| ♡ | 1 |
CJK
| Value | Count | Frequency (%) |
| 狗 | 1 | |
| 貓 | 1 | |
| 時 | 1 | |
| 傳 | 1 | |
| 空 | 1 |
Number Forms
| Value | Count | Frequency (%) |
| ⅓ | 1 |
Letterlike Symbols
| Value | Count | Frequency (%) |
| ™ | 1 | |
| № | 1 |
Math Operators
| Value | Count | Frequency (%) |
| ∞ | 1 | |
| − | 1 |
Katakana
| Value | Count | Frequency (%) |
| テ | 1 | |
| ポ | 1 | |
| ィ | 1 | |
| ス | 1 | |
| タ | 1 | |
| ン | 1 | |
| ァ | 1 | |
| フ | 1 |
Arrows
| Value | Count | Frequency (%) |
| → | 1 |
vote_average
Real number (ℝ)
| Distinct | 92 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.6239673 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 2947 |
| Zeros (%) | 6.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 354.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 5 |
| median | 6 |
| Q3 | 6.8 |
| 95-th percentile | 7.8 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 1.8 |
Descriptive statistics
| Standard deviation | 1.9155471 |
|---|---|
| Coefficient of variation (CV) | 0.34060424 |
| Kurtosis | 2.5414062 |
| Mean | 5.6239673 |
| Median Absolute Deviation (MAD) | 0.9 |
| Skewness | -1.5244317 |
| Sum | 255136.9 |
| Variance | 3.6693206 |
| Monotonicity | Not monotonic |
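Several of the vote_average figures above are derived from one another, so they can be cross-checked with a few lines of arithmetic. The sketch below uses the standard definitions (CV as standard deviation over mean, IQR as Q3 minus Q1, variance as the squared standard deviation, sum as mean times row count); the variable names are illustrative, and the inputs are the rounded values printed in this report, so small floating-point differences are expected.

```python
n = 45366                  # "Number of observations" in the dataset statistics
mean = 5.6239673           # vote_average "Mean"
std = 1.9155471            # vote_average "Standard deviation"
q1, q3 = 5.0, 6.8          # "Q1" and "Q3"
minimum, maximum = 0.0, 10.0

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
total = mean * n                   # sum of all ratings (no missing values)

print(f"CV={cv:.8f} variance={variance:.7f} IQR={iqr:.1f} "
      f"range={value_range:.0f} sum={total:.1f}")
```

The sum check relies on vote_average having no missing values, which the table above reports.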
Common Values
| Value | Count | Frequency (%) |
| 0 | 2947 | 6.5% |
| 6 | 2462 | 5.4% |
| 5 | 1996 | 4.4% |
| 7 | 1885 | 4.2% |
| 6.5 | 1722 | 3.8% |
| 6.3 | 1602 | 3.5% |
| 5.5 | 1381 | 3.0% |
| 5.8 | 1369 | 3.0% |
| 6.4 | 1349 | 3.0% |
| 6.7 | 1339 | 3.0% |
| Other values (82) | 27314 | 60.2% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 2947 | 6.5% |
| 0.5 | 13 | < 0.1% |
| 0.7 | 1 | < 0.1% |
| 1 | 103 | 0.2% |
| 1.1 | 1 | < 0.1% |
| 1.2 | 4 | < 0.1% |
| 1.3 | 13 | < 0.1% |
| 1.4 | 5 | < 0.1% |
| 1.5 | 30 | 0.1% |
| 1.6 | 6 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 10 | 185 | 0.4% |
| 9.8 | 1 | < 0.1% |
| 9.6 | 1 | < 0.1% |
| 9.5 | 18 | < 0.1% |
| 9.4 | 3 | < 0.1% |
| 9.3 | 18 | < 0.1% |
| 9.2 | 4 | < 0.1% |
| 9.1 | 2 | < 0.1% |
| 9 | 158 | 0.3% |
| 8.9 | 7 | < 0.1% |
vote_count
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 1820 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 110.11824 |
| Minimum | 0 |
|---|---|
| Maximum | 14075 |
| Zeros | 2849 |
| Zeros (%) | 6.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 354.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 10 |
| Q3 | 34 |
| 95-th percentile | 434 |
| Maximum | 14075 |
| Range | 14075 |
| Interquartile range (IQR) | 31 |
Descriptive statistics
| Standard deviation | 491.79559 |
|---|---|
| Coefficient of variation (CV) | 4.4660684 |
| Kurtosis | 150.89469 |
| Mean | 110.11824 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 10.439597 |
| Sum | 4995624 |
| Variance | 241862.9 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 1 | 3241 | 7.1% |
| 2 | 3127 | 6.9% |
| 0 | 2849 | 6.3% |
| 3 | 2781 | 6.1% |
| 4 | 2477 | 5.5% |
| 5 | 2096 | 4.6% |
| 6 | 1747 | 3.9% |
| 7 | 1570 | 3.5% |
| 8 | 1359 | 3.0% |
| 9 | 1194 | 2.6% |
| Other values (1810) | 22925 |
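The Frequency (%) column in the vote_count table above is each count divided by the number of observations. The sketch below recomputes it and also adds up how many movies have fewer than 10 votes, which helps explain why the median is 10 while the mean is about 110. The names `common_values` and `frequency_pct` are illustrative, and rounding to one decimal is an assumption that happens to match the printed values.

```python
n_observations = 45366  # "Number of observations" in the dataset statistics

# (vote_count value, number of movies) rows copied from the table above.
common_values = [
    (1, 3241), (2, 3127), (0, 2849), (3, 2781), (4, 2477),
    (5, 2096), (6, 1747), (7, 1570), (8, 1359), (9, 1194),
]
other_values = 22925  # "Other values (1810)" row


def frequency_pct(count, total):
    """Share of rows holding a value, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


for votes, count in common_values:
    print(votes, count, f"{frequency_pct(count, n_observations)}%")

# Movies with fewer than 10 votes, and their share of the dataset.
low_vote_movies = sum(count for _, count in common_values)
share_below_10 = low_vote_movies / n_observations
print(low_vote_movies, f"{100 * share_below_10:.1f}%")
```

Just under half of the movies have fewer than 10 votes, so a small number of heavily voted titles pulls the mean far above the median.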
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 2849 | 6.3% |
| 1 | 3241 | 7.1% |
| 2 | 3127 | 6.9% |
| 3 | 2781 | 6.1% |
| 4 | 2477 | 5.5% |
| 5 | 2096 | 4.6% |
| 6 | 1747 | 3.9% |
| 7 | 1570 | 3.5% |
| 8 | 1359 | 3.0% |
| 9 | 1194 | 2.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 14075 | 1 | < 0.1% |
| 12269 | 1 | < 0.1% |
| 12114 | 1 | < 0.1% |
| 12000 | 1 | < 0.1% |
| 11444 | 1 | < 0.1% |
| 11187 | 1 | < 0.1% |
| 10297 | 1 | < 0.1% |
| 10014 | 1 | < 0.1% |
| 9678 | 1 | < 0.1% |
| 9634 | 1 | < 0.1% |
cast
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 42656 |
|---|---|
| Distinct (%) | 99.2% |
| Missing | 2348 |
| Missing (%) | 5.2% |
| Memory size | 354.5 KiB |
| Georges Méliès | 24 |
|---|---|
| Louis Theroux | 15 |
| Mel Blanc | 12 |
| Jimmy Carr | 9 |
| George Carlin | 8 |
| Other values (42651) | 42950 |
Length
| Max length | 4551 |
|---|---|
| Median length | 1364 |
| Mean length | 198.06544 |
| Min length | 4 |
Characters and Unicode
| Total characters | 8520379 |
|---|---|
| Distinct characters | 395 |
| Distinct categories | 16 |
| Distinct scripts | 9 |
| Distinct blocks | 10 |
Unique
| Unique | 42475 |
|---|---|
| Unique (%) | 98.7% |
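The Distinct (%) and Unique (%) figures for cast only add up if the denominator is the number of non-missing rows. The sketch below checks this, assuming "Distinct" counts different values and "Unique" counts values that occur exactly once; those definitions and the variable names are assumptions made for illustration, not statements taken from the report.

```python
n_observations = 45366   # "Number of observations" in the dataset statistics
n_missing = 2348         # cast "Missing"
n_distinct = 42656       # cast "Distinct"
n_unique = 42475         # cast "Unique"

n_present = n_observations - n_missing

# Percentages relative to rows that actually have a cast value.
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present

# For comparison: the same ratio over all rows, missing ones included.
distinct_pct_all_rows = 100 * n_distinct / n_observations

print(round(distinct_pct, 1), round(unique_pct, 1), round(distinct_pct_all_rows, 1))
```

Only the non-missing denominator reproduces the reported 99.2% and 98.7%; dividing by all observations gives about 94.0% for the distinct share.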
Sample
| 1st row | Tom Hanks, Tim Allen, Don Rickles, Jim Varney, Wallace Shawn, John Ratzenberger, Annie Potts, John Morris, Erik von Detten, Laurie Metcalf, R. Lee Ermey, Sarah Freeman, Penn Jillette |
|---|---|
| 2nd row | Robin Williams, Jonathan Hyde, Kirsten Dunst, Bradley Pierce, Bonnie Hunt, Bebe Neuwirth, David Alan Grier, Patricia Clarkson, Adam Hann-Byrd, Laura Bell Bundy, James Handy, Gillian Barber, Brandon Obray, Cyrus Thiedeke, Gary Joseph Thorup, Leonard Zola, Lloyd Berry, Malcolm Stewart, Annabel Kershaw, Darryl Henriques, Robyn Driscoll, Peter Bryant, Sarah Gilson, Florica Vlad, June Lion, Brenda Lockmuller |
| 3rd row | Walter Matthau, Jack Lemmon, Ann-Margret, Sophia Loren, Daryl Hannah, Burgess Meredith, Kevin Pollak |
| 4th row | Whitney Houston, Angela Bassett, Loretta Devine, Lela Rochon, Gregory Hines, Dennis Haysbert, Michael Beach, Mykelti Williamson, Lamont Johnson, Wesley Snipes |
| 5th row | Steve Martin, Diane Keaton, Martin Short, Kimberly Williams-Paisley, George Newbern, Kieran Culkin, BD Wong, Peter Michael Goetz, Kate McGregor-Stewart, Jane Adams, Eugene Levy, Lori Alan |
Common Values
| Value | Count | Frequency (%) |
| Georges Méliès | 24 | 0.1% |
| Louis Theroux | 15 | < 0.1% |
| Mel Blanc | 12 | < 0.1% |
| Jimmy Carr | 9 | < 0.1% |
| George Carlin | 8 | < 0.1% |
| Werner Herzog | 8 | < 0.1% |
| David Attenborough | 8 | < 0.1% |
| Louis C.K. | 8 | < 0.1% |
| Ricky Gervais | 6 | < 0.1% |
| Trevor Noah | 6 | < 0.1% |
| Other values (42646) | 42914 | 94.6% |
| (Missing) | 2348 | 5.2% |
Words
| Value | Count | Frequency (%) |
| john | 9804 | 0.8% |
| michael | 7458 | 0.6% |
| david | 6185 | 0.5% |
| robert | 5722 | 0.5% |
| james | 5689 | 0.5% |
| richard | 4446 | 0.4% |
| paul | 4313 | 0.4% |
| peter | 3901 | 0.3% |
| william | 3431 | 0.3% |
| george | 3416 | 0.3% |
| Other values (112933) | 1110657 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1122132 | 13.2% |
| a | 704925 | 8.3% |
| e | 665316 | 7.8% |
| n | 524106 | 6.2% |
| , | 519485 | 6.1% |
| r | 497363 | 5.8% |
| i | 484022 | 5.7% |
| o | 423803 | 5.0% |
| l | 366466 | 4.3% |
| s | 255868 | 3.0% |
| Other values (385) | 2956893 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5651100 | |
| Uppercase Letter | 1190431 | 14.0% |
| Space Separator | 1122135 | 13.2% |
| Other Punctuation | 541788 | 6.4% |
| Dash Punctuation | 14101 | 0.2% |
| Other Letter | 543 | < 0.1% |
| Decimal Number | 94 | < 0.1% |
| Final Punctuation | 83 | < 0.1% |
| Initial Punctuation | 23 | < 0.1% |
| Open Punctuation | 23 | < 0.1% |
| Other values (6) | 58 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 704925 | |
| e | 665316 | |
| n | 524106 | |
| r | 497363 | 8.8% |
| i | 484022 | 8.6% |
| o | 423803 | 7.5% |
| l | 366466 | 6.5% |
| s | 255868 | 4.5% |
| t | 253211 | 4.5% |
| h | 197885 | 3.5% |
| Other values (138) | 1278135 |
Other Letter
| Value | Count | Frequency (%) |
| ا | 32 | 5.9% |
| م | 31 | 5.7% |
| ی | 19 | 3.5% |
| ع | 19 | 3.5% |
| ن | 18 | 3.3% |
| ر | 17 | 3.1% |
| 松 | 17 | 3.1% |
| د | 17 | 3.1% |
| ي | 16 | 2.9% |
| 美 | 12 | 2.2% |
| Other values (104) | 345 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 109353 | 9.2% |
| S | 92313 | 7.8% |
| C | 84003 | 7.1% |
| J | 83331 | 7.0% |
| B | 82353 | 6.9% |
| A | 70824 | 5.9% |
| R | 67394 | 5.7% |
| D | 65885 | 5.5% |
| L | 61163 | 5.1% |
| G | 54661 | 4.6% |
| Other values (81) | 419151 |
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 37 | |
| 0 | 29 | |
| 1 | 8 | 8.5% |
| 2 | 8 | 8.5% |
| 9 | 4 | 4.3% |
| 7 | 2 | 2.1% |
| 3 | 2 | 2.1% |
| 4 | 2 | 2.1% |
| 8 | 1 | 1.1% |
| 6 | 1 | 1.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 519485 | |
| . | 16049 | 3.0% |
| ' | 6098 | 1.1% |
| " | 129 | < 0.1% |
| · | 9 | < 0.1% |
| & | 6 | < 0.1% |
| : | 6 | < 0.1% |
| ! | 5 | < 0.1% |
| / | 1 | < 0.1% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ́ | 10 | |
| ิ | 2 | 11.8% |
| ี | 1 | 5.9% |
| ์ | 1 | 5.9% |
| ั | 1 | 5.9% |
| ่ | 1 | 5.9% |
| ึ | 1 | 5.9% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 74 | |
| ” | 6 | 7.2% |
| » | 3 | 3.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1122132 | > 99.9% |
| (non-ASCII space) | 3 | < 0.1% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 20 | |
| « | 3 | 13.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| „ | 14 | |
| ( | 9 |
Format
| Value | Count | Frequency (%) |
| | 5 | |
| | 1 | 16.7% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 14101 |
Control
| Value | Count | Frequency (%) |
| (control character) | 21 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 9 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 3 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ´ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6838447 | |
| Common | 1678287 | 19.7% |
| Cyrillic | 3070 | < 0.1% |
| Han | 276 | < 0.1% |
| Arabic | 241 | < 0.1% |
| Thai | 27 | < 0.1% |
| Greek | 14 | < 0.1% |
| Inherited | 11 | < 0.1% |
| Hangul | 6 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 704925 | 10.3% |
| e | 665316 | 9.7% |
| n | 524106 | 7.7% |
| r | 497363 | 7.3% |
| i | 484022 | 7.1% |
| o | 423803 | 6.2% |
| l | 366466 | 5.4% |
| s | 255868 | 3.7% |
| t | 253211 | 3.7% |
| h | 197885 | 2.9% |
| Other values (163) | 2465482 |
Han
| Value | Count | Frequency (%) |
| 松 | 17 | 6.2% |
| 美 | 12 | 4.3% |
| 长 | 11 | 4.0% |
| 平 | 11 | 4.0% |
| 龙 | 11 | 4.0% |
| 田 | 11 | 4.0% |
| 泽 | 11 | 4.0% |
| 雅 | 11 | 4.0% |
| 杰 | 9 | 3.3% |
| 森 | 9 | 3.3% |
| Other values (55) | 163 |
Cyrillic
| Value | Count | Frequency (%) |
| а | 323 | 10.5% |
| и | 315 | 10.3% |
| о | 233 | 7.6% |
| н | 229 | 7.5% |
| р | 215 | 7.0% |
| е | 174 | 5.7% |
| л | 155 | 5.0% |
| к | 136 | 4.4% |
| т | 115 | 3.7% |
| с | 109 | 3.6% |
| Other values (51) | 1066 |
Common
| Value | Count | Frequency (%) |
| (space) | 1122132 | 66.9% |
| , | 519485 | 31.0% |
| . | 16049 | 1.0% |
| - | 14101 | 0.8% |
| ' | 6098 | 0.4% |
| " | 129 | < 0.1% |
| ’ | 74 | < 0.1% |
| 5 | 37 | < 0.1% |
| 0 | 29 | < 0.1% |
| (control character) | 21 | < 0.1% |
| Other values (24) | 132 | < 0.1% |
Arabic
| Value | Count | Frequency (%) |
| ا | 32 | |
| م | 31 | |
| ی | 19 | 7.9% |
| ع | 19 | 7.9% |
| ن | 18 | 7.5% |
| ر | 17 | 7.1% |
| د | 17 | 7.1% |
| ي | 16 | 6.6% |
| ل | 9 | 3.7% |
| س | 8 | 3.3% |
| Other values (18) | 55 |
Thai
| Value | Count | Frequency (%) |
| า | 2 | 7.4% |
| ง | 2 | 7.4% |
| น | 2 | 7.4% |
| ร | 2 | 7.4% |
| ว | 2 | 7.4% |
| ิ | 2 | 7.4% |
| ี | 1 | 3.7% |
| ศ | 1 | 3.7% |
| ธ | 1 | 3.7% |
| ค | 1 | 3.7% |
| Other values (11) | 11 |
Hangul
| Value | Count | Frequency (%) |
| 조 | 1 | |
| 병 | 1 | |
| 열 | 1 | |
| 계 | 1 | |
| 강 | 1 | |
| 만 | 1 |
Greek
| Value | Count | Frequency (%) |
| ν | 6 | |
| ί | 2 | 14.3% |
| Ζ | 2 | 14.3% |
| α | 2 | 14.3% |
| ο | 2 | 14.3% |
Inherited
| Value | Count | Frequency (%) |
| ́ | 10 | |
| | 1 | 9.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8478314 | |
| None | 38259 | 0.4% |
| Cyrillic | 3070 | < 0.1% |
| CJK | 276 | < 0.1% |
| Arabic | 241 | < 0.1% |
| Punctuation | 120 | < 0.1% |
| Latin Ext Additional | 56 | < 0.1% |
| Thai | 27 | < 0.1% |
| Diacriticals | 10 | < 0.1% |
| Hangul | 6 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 1122132 | 13.2% |
| a | 704925 | 8.3% |
| e | 665316 | 7.8% |
| n | 524106 | 6.2% |
| , | 519485 | 6.1% |
| r | 497363 | 5.9% |
| i | 484022 | 5.7% |
| o | 423803 | 5.0% |
| l | 366466 | 4.3% |
| s | 255868 | 3.0% |
| Other values (66) | 2914828 |
None
| Value | Count | Frequency (%) |
| é | 9079 | |
| á | 4155 | 10.9% |
| í | 2756 | 7.2% |
| ô | 2333 | 6.1% |
| ö | 2014 | 5.3% |
| ó | 1881 | 4.9% |
| ü | 1492 | 3.9% |
| ć | 1360 | 3.6% |
| è | 1243 | 3.2% |
| ä | 994 | 2.6% |
| Other values (111) | 10952 |
Cyrillic
| Value | Count | Frequency (%) |
| а | 323 | 10.5% |
| и | 315 | 10.3% |
| о | 233 | 7.6% |
| н | 229 | 7.5% |
| р | 215 | 7.0% |
| е | 174 | 5.7% |
| л | 155 | 5.0% |
| к | 136 | 4.4% |
| т | 115 | 3.7% |
| с | 109 | 3.6% |
| Other values (51) | 1066 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 74 | |
| “ | 20 | 16.7% |
| „ | 14 | 11.7% |
| ” | 6 | 5.0% |
| | 5 | 4.2% |
| | 1 | 0.8% |
Arabic
| Value | Count | Frequency (%) |
| ا | 32 | |
| م | 31 | |
| ی | 19 | 7.9% |
| ع | 19 | 7.9% |
| ن | 18 | 7.5% |
| ر | 17 | 7.1% |
| د | 17 | 7.1% |
| ي | 16 | 6.6% |
| ل | 9 | 3.7% |
| س | 8 | 3.3% |
| Other values (18) | 55 |
CJK
| Value | Count | Frequency (%) |
| 松 | 17 | 6.2% |
| 美 | 12 | 4.3% |
| 长 | 11 | 4.0% |
| 平 | 11 | 4.0% |
| 龙 | 11 | 4.0% |
| 田 | 11 | 4.0% |
| 泽 | 11 | 4.0% |
| 雅 | 11 | 4.0% |
| 杰 | 9 | 3.3% |
| 森 | 9 | 3.3% |
| Other values (55) | 163 |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ễ | 15 | |
| ạ | 9 | |
| ỳ | 6 | 10.7% |
| ị | 6 | 10.7% |
| ế | 5 | 8.9% |
| ỗ | 4 | 7.1% |
| ề | 4 | 7.1% |
| ả | 4 | 7.1% |
| ầ | 2 | 3.6% |
| ố | 1 | 1.8% |
Diacriticals
| Value | Count | Frequency (%) |
| ́ | 10 |
Thai
| Value | Count | Frequency (%) |
| า | 2 | 7.4% |
| ง | 2 | 7.4% |
| น | 2 | 7.4% |
| ร | 2 | 7.4% |
| ว | 2 | 7.4% |
| ิ | 2 | 7.4% |
| ี | 1 | 3.7% |
| ศ | 1 | 3.7% |
| ธ | 1 | 3.7% |
| ค | 1 | 3.7% |
| Other values (11) | 11 |
Hangul
| Value | Count | Frequency (%) |
| 조 | 1 | |
| 병 | 1 | |
| 열 | 1 | |
| 계 | 1 | |
| 강 | 1 | |
| 만 | 1 |
crew
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 42943 |
|---|---|
| Distinct (%) | 96.2% |
| Missing | 723 |
| Missing (%) | 1.6% |
| Memory size | 354.5 KiB |
| Director: Georges Méliès | 35 |
|---|---|
| Director: Christian I. Nyby II | 13 |
| Director: Norman McLaren | 12 |
| Director: Charlie Chaplin, Writer: Charlie Chaplin | 12 |
| Director: Frederick Wiseman | 12 |
| Other values (42938) | 44559 |
Length
| Max length | 5043 |
|---|---|
| Median length | 3354 |
| Mean length | 233.66304 |
| Min length | 11 |
Characters and Unicode
| Total characters | 10431419 |
|---|---|
| Distinct characters | 333 |
| Distinct categories | 15 |
| Distinct scripts | 8 |
| Distinct blocks | 9 |
Unique
| Unique | 41852 |
|---|---|
| Unique (%) | 93.7% |
Sample
| 1st row | Director: John Lasseter, Screenplay: Alec Sokolow, Producer: Ralph Guggenheim, Executive Producer: Steve Jobs, Editor: Robert Gordon, Art Direction: Ralph Eggleston, Foley Editor: Mary Helen Leasman, Animation: Ken Willard, ADR Editor: Marilyn McCoppen, Orchestrator: Don Davis, Color Timer: Dale E. Grahn, CG Painter: William Cone, Original Story: Andrew Stanton, Post Production Supervisor: Patsy Bouge, Sculptor: Shelley Daniels Lekven, Animation Director: Rich Quade, Music: Randy Newman, Layout: Desirée Mourad, Music Editor: James Flamberg, Negative Cutter: Rick Mackay, Title Designer: Susan Bradley, Supervising Technical Director: William Reeves, Songs: Randy Newman, Supervising Animator: Pete Docter, Sound Designer: Gary Rydstrom, Production Supervisor: Karen Robert Jackson, Executive Music Producer: Chris Montan, Visual Effects Supervisor: Thomas Porter, Visual Effects: Brian M. Rosen, Lighting Supervisor: Galyn Susman, Character Designer: Jean Gillmore, Set Dresser: Ann M. Rockwell, Editorial Manager: Julie M. McDonald, Assistant Editor: Dana Mulligan, Editorial Coordinator: Deirdre Morrison, Production Coordinator: Ellen Devine, Unit Publicist: Lauren Beth Strogoff, Sound Re-Recording Mixer: Gary Summers, Supervising Sound Editor: Tim Holland, Sound Effects Editor: Pat Jackson, Sound Design Assistant: Tom Myers, Assistant Sound Editor: Dan Engstrom, Casting Consultant: Ruth Lambert, ADR Voice Casting: Mickie McGowan |
|---|---|
| 2nd row | Executive Producer: Robert W. Cort, Screenplay: Jim Strain, Original Music Composer: James Horner, Director: Joe Johnston, Editor: Robert Dalva, Casting: Nancy Foy, Animation Supervisor: Kyle Balda, Production Design: James D. Bissell, Producer: William Teitler, Director of Photography: Thomas E. Ackerman, Novel: Chris van Allsburg |
| 3rd row | Director: Howard Deutch, Characters: Mark Steven Johnson, Writer: Mark Steven Johnson, Sound Recordist: Jack Keller |
| 4th row | Director: Forest Whitaker, Screenplay: Terry McMillan, Producer: Caron K, Executive Producer: Terry McMillan, Novel: Terry McMillan, Original Music Composer: Kenneth Edmonds |
| 5th row | Original Music Composer: Alan Silvestri, Director of Photography: Elliot Davis, Screenplay: Albert Hackett, Producer: Nancy Meyers, Director: Charles Shyer, Editor: Adam Bernardi |
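The sample values above follow a `Role: Name, Role: Name` pattern. A minimal sketch of how such a string could be split into pairs, assuming `", "` separates entries and `": "` separates role from name (the helper `parse_crew` is hypothetical and not part of the report):

```python
from typing import List, Tuple


def parse_crew(crew: str) -> List[Tuple[str, str]]:
    """Split a 'Role: Name, Role: Name' string into (role, name) pairs."""
    pairs: List[Tuple[str, str]] = []
    for entry in crew.split(", "):
        role, sep, name = entry.partition(": ")
        if sep:
            pairs.append((role.strip(), name.strip()))
        elif pairs:
            # No "Role: " prefix: assume the piece continues the previous
            # name (for example a ", Jr." suffix) and re-attach it.
            prev_role, prev_name = pairs[-1]
            pairs[-1] = (prev_role, prev_name + ", " + entry.strip())
    return pairs


# Third sample row from the table above.
sample = (
    "Director: Howard Deutch, Characters: Mark Steven Johnson, "
    "Writer: Mark Steven Johnson, Sound Recordist: Jack Keller"
)
crew = parse_crew(sample)
directors = [name for role, name in crew if role == "Director"]
```

Splitting into pairs would let the high-cardinality `crew` column be analysed per role (for example, one director per film) rather than as 42943 mostly unique strings.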
Common Values
| Value | Count | Frequency (%) |
| Director: Georges Méliès | 35 | 0.1% |
| Director: Christian I. Nyby II | 13 | < 0.1% |
| Director: Norman McLaren | 12 | < 0.1% |
| Director: Charlie Chaplin, Writer: Charlie Chaplin | 12 | < 0.1% |
| Director: Frederick Wiseman | 12 | < 0.1% |
| Director: Gerald Thomas, Screenplay: Talbot Rothwell | 11 | < 0.1% |
| Director: Stan Brakhage | 10 | < 0.1% |
| Director: James H. White | 10 | < 0.1% |
| Director: James Benning | 10 | < 0.1% |
| Director: William K.L. Dickson | 9 | < 0.1% |
| Other values (42933) | 44509 | 98.1% |
| (Missing) | 723 | 1.6% |
Length
| Value | Count | Frequency (%) |
| director | 69179 | 5.3% |
| producer | 36150 | 2.7% |
| editor | 30798 | 2.3% |
| music | 23802 | 1.8% |
| writer | 20800 | 1.6% |
| design | 20256 | 1.5% |
| of | 19577 | 1.5% |
| photography | 19508 | 1.5% |
| production | 17620 | 1.3% |
| screenplay | 15719 | 1.2% |
| Other values (78871) | 1041517 | 79.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1270337 | 12.2% |
| r | 863480 | 8.3% |
| e | 795230 | 7.6% |
| o | 677278 | 6.5% |
| i | 676464 | 6.5% |
| a | 594616 | 5.7% |
| t | 523212 | 5.0% |
| n | 497393 | 4.8% |
| : | 344604 | 3.3% |
| s | 343041 | 3.3% |
| Other values (323) | 3845764 | 36.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7152183 | 68.6% |
| Uppercase Letter | 1316789 | 12.6% |
| Space Separator | 1270337 | 12.2% |
| Other Punctuation | 679118 | 6.5% |
| Dash Punctuation | 12365 | 0.1% |
| Decimal Number | 265 | < 0.1% |
| Other Letter | 163 | < 0.1% |
| Control | 151 | < 0.1% |
| Open Punctuation | 16 | < 0.1% |
| Close Punctuation | 16 | < 0.1% |
| Other values (5) | 16 | < 0.1% |
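The category names in this table match Unicode general categories. A minimal sketch of how comparable counts could be produced with Python's standard `unicodedata` module follows; this is an assumption about the approach, not the profiling tool's own code, and `category_counts` is a hypothetical helper:

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general categories mapped to the long names used above.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter", "Lu": "Uppercase Letter", "Lo": "Other Letter",
    "Zs": "Space Separator", "Po": "Other Punctuation", "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Pi": "Initial Punctuation", "Pf": "Final Punctuation",
    "Nd": "Decimal Number", "Cc": "Control", "Mn": "Nonspacing Mark",
    "Sm": "Math Symbol", "Sk": "Modifier Symbol",
}


def category_counts(text: str) -> Counter:
    """Count characters per Unicode general category, using long names."""
    # NFC keeps accented letters such as 'é' as single code points.
    text = unicodedata.normalize("NFC", text)
    return Counter(
        CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
        for ch in text
    )


counts = category_counts("Director: Georges Méliès")
```

For a whole column, the same function could be applied to the concatenation of all non-missing values.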
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 863480 | 12.1% |
| e | 795230 | 11.1% |
| o | 677278 | 9.5% |
| i | 676464 | 9.5% |
| a | 594616 | 8.3% |
| t | 523212 | 7.3% |
| n | 497393 | 7.0% |
| s | 343041 | 4.8% |
| c | 322132 | 4.5% |
| l | 286665 | 4.0% |
| Other values (122) | 1572672 | 22.0% |
Other Letter
| Value | Count | Frequency (%) |
| ا | 9 | 5.5% |
| م | 7 | 4.3% |
| 이 | 7 | 4.3% |
| 진 | 7 | 4.3% |
| 정 | 5 | 3.1% |
| د | 5 | 3.1% |
| 모 | 4 | 2.5% |
| 연 | 4 | 2.5% |
| 아 | 4 | 2.5% |
| ع | 4 | 2.5% |
| Other values (76) | 107 | 65.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 162117 | 12.3% |
| S | 140855 | 10.7% |
| P | 118060 | 9.0% |
| C | 111694 | 8.5% |
| M | 109905 | 8.3% |
| A | 80737 | 6.1% |
| E | 72375 | 5.5% |
| J | 55130 | 4.2% |
| R | 53353 | 4.1% |
| B | 52132 | 4.0% |
| Other values (75) | 360431 | 27.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 183 | 69.1% |
| 2 | 37 | 14.0% |
| 4 | 18 | 6.8% |
| 0 | 8 | 3.0% |
| 5 | 7 | 2.6% |
| 8 | 4 | 1.5% |
| 9 | 4 | 1.5% |
| 7 | 3 | 1.1% |
| 1 | 1 | 0.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 344604 | 50.7% |
| , | 300060 | 44.2% |
| . | 31072 | 4.6% |
| ' | 2561 | 0.4% |
| & | 698 | 0.1% |
| / | 98 | < 0.1% |
| " | 24 | < 0.1% |
| · | 1 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 12362 | > 99.9% |
| – | 3 | < 0.1% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 6 | 75.0% |
| ” | 2 | 25.0% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ̃ | 2 | 50.0% |
| ́ | 2 | 50.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1270337 | 100.0% |
Control
| Value | Count | Frequency (%) |
| (control character) | 151 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 16 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 16 | 100.0% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 2 | 100.0% |
Math Symbol
| Value | Count | Frequency (%) |
| \| | 1 | 100.0% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ´ | 1 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8468205 | 81.2% |
| Common | 1962281 | 18.8% |
| Cyrillic | 749 | < 0.1% |
| Hangul | 98 | < 0.1% |
| Arabic | 52 | < 0.1% |
| Greek | 17 | < 0.1% |
| Han | 13 | < 0.1% |
| Inherited | 4 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 863480 | 10.2% |
| e | 795230 | 9.4% |
| o | 677278 | 8.0% |
| i | 676464 | 8.0% |
| a | 594616 | 7.0% |
| t | 523212 | 6.2% |
| n | 497393 | 5.9% |
| s | 343041 | 4.1% |
| c | 322132 | 3.8% |
| l | 286665 | 3.4% |
| Other values (143) | 2888694 | 34.1% |
Hangul
| Value | Count | Frequency (%) |
| 이 | 7 | 7.1% |
| 진 | 7 | 7.1% |
| 정 | 5 | 5.1% |
| 모 | 4 | 4.1% |
| 연 | 4 | 4.1% |
| 아 | 4 | 4.1% |
| 영 | 3 | 3.1% |
| 박 | 3 | 3.1% |
| 현 | 3 | 3.1% |
| 성 | 3 | 3.1% |
| Other values (46) | 55 | 56.1% |
Cyrillic
| Value | Count | Frequency (%) |
| и | 86 | 11.5% |
| а | 70 | 9.3% |
| р | 53 | 7.1% |
| о | 49 | 6.5% |
| л | 46 | 6.1% |
| е | 45 | 6.0% |
| н | 39 | 5.2% |
| к | 38 | 5.1% |
| в | 34 | 4.5% |
| с | 31 | 4.1% |
| Other values (38) | 258 | 34.4% |
Common
| Value | Count | Frequency (%) |
| (space) | 1270337 | 64.7% |
| : | 344604 | 17.6% |
| , | 300060 | 15.3% |
| . | 31072 | 1.6% |
| - | 12362 | 0.6% |
| ' | 2561 | 0.1% |
| & | 698 | < 0.1% |
| 3 | 183 | < 0.1% |
| (control character) | 151 | < 0.1% |
| / | 98 | < 0.1% |
| Other values (19) | 155 | < 0.1% |
Arabic
| Value | Count | Frequency (%) |
| ا | 9 | 17.3% |
| م | 7 | 13.5% |
| د | 5 | 9.6% |
| ع | 4 | 7.7% |
| ي | 4 | 7.7% |
| ی | 4 | 7.7% |
| ل | 3 | 5.8% |
| ح | 3 | 5.8% |
| ن | 3 | 5.8% |
| پ | 2 | 3.8% |
| Other values (7) | 8 | 15.4% |
Greek
| Value | Count | Frequency (%) |
| ς | 2 | 11.8% |
| ρ | 2 | 11.8% |
| Φ | 1 | 5.9% |
| ν | 1 | 5.9% |
| α | 1 | 5.9% |
| β | 1 | 5.9% |
| Α | 1 | 5.9% |
| ο | 1 | 5.9% |
| γ | 1 | 5.9% |
| ώ | 1 | 5.9% |
| Other values (5) | 5 | 29.4% |
Han
| Value | Count | Frequency (%) |
| 森 | 1 | 7.7% |
| 杰 | 1 | 7.7% |
| 立 | 1 | 7.7% |
| 张 | 1 | 7.7% |
| 莫 | 1 | 7.7% |
| 玛 | 1 | 7.7% |
| 中 | 1 | 7.7% |
| 村 | 1 | 7.7% |
| 誠 | 1 | 7.7% |
| 義 | 1 | 7.7% |
| Other values (3) | 3 | 23.1% |
Inherited
| Value | Count | Frequency (%) |
| ̃ | 2 | 50.0% |
| ́ | 2 | 50.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10408883 | 99.8% |
| None | 21602 | 0.2% |
| Cyrillic | 749 | < 0.1% |
| Hangul | 98 | < 0.1% |
| Arabic | 52 | < 0.1% |
| Punctuation | 13 | < 0.1% |
| CJK | 13 | < 0.1% |
| Latin Ext Additional | 5 | < 0.1% |
| Diacriticals | 4 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 1270337 | 12.2% |
| r | 863480 | 8.3% |
| e | 795230 | 7.6% |
| o | 677278 | 6.5% |
| i | 676464 | 6.5% |
| a | 594616 | 5.7% |
| t | 523212 | 5.0% |
| n | 497393 | 4.8% |
| : | 344604 | 3.3% |
| s | 343041 | 3.3% |
| Other values (64) | 3823228 | 36.7% |
None
| Value | Count | Frequency (%) |
| é | 5650 | 26.2% |
| á | 2484 | 11.5% |
| í | 1571 | 7.3% |
| ó | 1396 | 6.5% |
| ö | 1264 | 5.9% |
| ô | 1143 | 5.3% |
| è | 679 | 3.1% |
| ü | 671 | 3.1% |
| ç | 617 | 2.9% |
| ä | 594 | 2.7% |
| Other values (106) | 5533 | 25.6% |
Cyrillic
| Value | Count | Frequency (%) |
| и | 86 | 11.5% |
| а | 70 | 9.3% |
| р | 53 | 7.1% |
| о | 49 | 6.5% |
| л | 46 | 6.1% |
| е | 45 | 6.0% |
| н | 39 | 5.2% |
| к | 38 | 5.1% |
| в | 34 | 4.5% |
| с | 31 | 4.1% |
| Other values (38) | 258 | 34.4% |
Arabic
| Value | Count | Frequency (%) |
| ا | 9 | 17.3% |
| م | 7 | 13.5% |
| د | 5 | 9.6% |
| ع | 4 | 7.7% |
| ي | 4 | 7.7% |
| ی | 4 | 7.7% |
| ل | 3 | 5.8% |
| ح | 3 | 5.8% |
| ن | 3 | 5.8% |
| پ | 2 | 3.8% |
| Other values (7) | 8 | 15.4% |
Hangul
| Value | Count | Frequency (%) |
| 이 | 7 | 7.1% |
| 진 | 7 | 7.1% |
| 정 | 5 | 5.1% |
| 모 | 4 | 4.1% |
| 연 | 4 | 4.1% |
| 아 | 4 | 4.1% |
| 영 | 3 | 3.1% |
| 박 | 3 | 3.1% |
| 현 | 3 | 3.1% |
| 성 | 3 | 3.1% |
| Other values (46) | 55 | 56.1% |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 6 | 46.2% |
| – | 3 | 23.1% |
| “ | 2 | 15.4% |
| ” | 2 | 15.4% |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ễ | 3 | 60.0% |
| ạ | 1 | 20.0% |
| ấ | 1 | 20.0% |
Diacriticals
| Value | Count | Frequency (%) |
| ̃ | 2 | 50.0% |
| ́ | 2 | 50.0% |
CJK
| Value | Count | Frequency (%) |
| 森 | 1 | 7.7% |
| 杰 | 1 | 7.7% |
| 立 | 1 | 7.7% |
| 张 | 1 | 7.7% |
| 莫 | 1 | 7.7% |
| 玛 | 1 | 7.7% |
| 中 | 1 | 7.7% |
| 村 | 1 | 7.7% |
| 誠 | 1 | 7.7% |
| 義 | 1 | 7.7% |
| Other values (3) | 3 | 23.1% |
release_year
Real number (ℝ)
| Distinct | 135 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1991.88 |
| Minimum | 1874 |
|---|---|
| Maximum | 2020 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 354.5 KiB |
Quantile statistics
| Minimum | 1874 |
|---|---|
| 5-th percentile | 1941 |
| Q1 | 1978 |
| median | 2001 |
| Q3 | 2010 |
| 95-th percentile | 2015 |
| Maximum | 2020 |
| Range | 146 |
| Interquartile range (IQR) | 32 |
Descriptive statistics
| Standard deviation | 24.055565 |
|---|---|
| Coefficient of variation (CV) | 0.012076814 |
| Kurtosis | 0.83912906 |
| Mean | 1991.88 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -1.2245988 |
| Sum | 90363629 |
| Variance | 578.67021 |
| Monotonicity | Not monotonic |
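Several of the `release_year` figures above are derived from one another. The following sketch re-checks those relationships using only the values printed in the report; the variable names are illustrative, and small differences are expected because the report rounds the mean to two decimals:

```python
# Values copied from the release_year tables above.
minimum, maximum = 1874, 2020
q1, q3 = 1978, 2010
mean, std = 1991.88, 24.055565
n_observations = 45366

value_range = maximum - minimum   # reported Range: 146
iqr = q3 - q1                     # reported IQR: 32
cv = std / mean                   # reported CV: 0.012076814
variance = std ** 2               # reported Variance: 578.67021
approx_sum = mean * n_observations  # reported Sum: 90363629 (mean is rounded)

print(value_range, iqr, round(cv, 6), round(variance, 3), round(approx_sum))
```

The coefficient of variation is tiny because the standard deviation (about 24 years) is compared against a mean near 2000; for a calendar-year column the IQR and range are usually the more informative spread measures.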
Common Values
| Value | Count | Frequency (%) |
| 2014 | 1973 | 4.3% |
| 2015 | 1905 | 4.2% |
| 2013 | 1890 | 4.2% |
| 2012 | 1722 | 3.8% |
| 2011 | 1667 | 3.7% |
| 2016 | 1604 | 3.5% |
| 2009 | 1585 | 3.5% |
| 2010 | 1501 | 3.3% |
| 2008 | 1470 | 3.2% |
| 2007 | 1319 | 2.9% |
| Other values (125) | 28730 | 63.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1874 | 1 | < 0.1% |
| 1878 | 1 | < 0.1% |
| 1883 | 1 | < 0.1% |
| 1887 | 1 | < 0.1% |
| 1888 | 2 | < 0.1% |
| 1890 | 5 | < 0.1% |
| 1891 | 6 | < 0.1% |
| 1892 | 3 | < 0.1% |
| 1893 | 1 | < 0.1% |
| 1894 | 13 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 2020 | 1 | < 0.1% |
| 2018 | 5 | < 0.1% |
| 2017 | 531 | 1.2% |
| 2016 | 1604 | 3.5% |
| 2015 | 1905 | 4.2% |
| 2014 | 1973 | 4.3% |
| 2013 | 1890 | 4.2% |
| 2012 | 1722 | 3.8% |
| 2011 | 1667 | 3.7% |
| 2010 | 1501 | 3.3% |
return
Real number (ℝ)
HIGH CORRELATION  SKEWED  ZEROS 
| Distinct | 1256 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 660.18833 |
| Minimum | 0 |
|---|---|
| Maximum | 12396383 |
| Zeros | 40043 |
| Zeros (%) | 88.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 354.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2.54 |
| Maximum | 12396383 |
| Range | 12396383 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 74701.525 |
|---|---|
| Coefficient of variation (CV) | 113.15184 |
| Kurtosis | 20668.4 |
| Mean | 660.18833 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 138.31428 |
| Sum | 29950104 |
| Variance | 5.5803179 × 10^9 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 40043 | 88.3% |
| 0.01 | 64 | 0.1% |
| 0.02 | 38 | 0.1% |
| 1 | 34 | 0.1% |
| 0.08 | 29 | 0.1% |
| 0.03 | 27 | 0.1% |
| 0.06 | 27 | 0.1% |
| 1.1 | 26 | 0.1% |
| 0.62 | 25 | 0.1% |
| 1.2 | 23 | 0.1% |
| Other values (1246) | 5030 | 11.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 40043 | 88.3% |
| 0.01 | 64 | 0.1% |
| 0.02 | 38 | 0.1% |
| 0.03 | 27 | 0.1% |
| 0.04 | 19 | < 0.1% |
| 0.05 | 22 | < 0.1% |
| 0.06 | 27 | 0.1% |
| 0.07 | 18 | < 0.1% |
| 0.08 | 29 | 0.1% |
| 0.09 | 16 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 12396383 | 1 | < 0.1% |
| 8500000 | 1 | < 0.1% |
| 4197476.62 | 1 | < 0.1% |
| 2755584 | 1 | < 0.1% |
| 1018619.28 | 1 | < 0.1% |
| 1000000 | 1 | < 0.1% |
| 26881.72 | 1 | < 0.1% |
| 12890.39 | 1 | < 0.1% |
| 5330.34 | 1 | < 0.1% |
| 4133.33 | 1 | < 0.1% |
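The report does not say how `return` was derived. The sample rows of the dataset are consistent with `revenue / budget` rounded to two decimals, with 0 where either figure is 0; the sketch below illustrates that assumed rule (`compute_return` is a hypothetical helper, not taken from the source):

```python
def compute_return(revenue: float, budget: float) -> float:
    """Assumed rule: revenue / budget, or 0.0 when either value is missing or 0."""
    if not revenue or not budget:
        return 0.0
    return round(revenue / budget, 2)


# (title, budget, revenue) copied from the first rows of the dataset.
rows = [
    ("Toy Story", 30000000, 373554033.0),
    ("Jumanji", 65000000, 262797249.0),
    ("Grumpier Old Men", 0, 0.0),
    ("Waiting to Exhale", 16000000, 81452156.0),
    ("Sabrina", 58000000, 0.0),
]
returns = {title: compute_return(revenue, budget) for title, budget, revenue in rows}
```

Under this rule, the 88.3% share of zeros would reflect films with no recorded budget or revenue, and the extreme maxima would arise whenever a very small budget is recorded against a large revenue, so the mean and standard deviation of `return` should be read with that in mind.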
| | budget | id | popularity | revenue | runtime | vote_average | vote_count | release_year | return | original_language | status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| budget | 1.000 | -0.255 | 0.463 | 0.645 | 0.227 | 0.072 | 0.484 | 0.141 | 0.771 | 0.000 | 0.000 |
| id | -0.255 | 1.000 | -0.410 | -0.278 | -0.206 | -0.149 | -0.433 | 0.392 | -0.263 | 0.071 | 0.056 |
| popularity | 0.463 | -0.410 | 1.000 | 0.491 | 0.307 | 0.241 | 0.893 | 0.185 | 0.446 | 0.000 | 0.000 |
| revenue | 0.645 | -0.278 | 0.491 | 1.000 | 0.254 | 0.127 | 0.513 | 0.104 | 0.849 | 0.000 | 0.000 |
| runtime | 0.227 | -0.206 | 0.307 | 0.254 | 1.000 | 0.193 | 0.290 | 0.034 | 0.234 | 0.111 | 0.000 |
| vote_average | 0.072 | -0.149 | 0.241 | 0.127 | 0.193 | 1.000 | 0.318 | -0.008 | 0.121 | 0.070 | 0.019 |
| vote_count | 0.484 | -0.433 | 0.893 | 0.513 | 0.290 | 0.318 | 1.000 | 0.197 | 0.473 | 0.000 | 0.000 |
| release_year | 0.141 | 0.392 | 0.185 | 0.104 | 0.034 | -0.008 | 0.197 | 1.000 | 0.085 | 0.144 | 0.028 |
| return | 0.771 | -0.263 | 0.446 | 0.849 | 0.234 | 0.121 | 0.473 | 0.085 | 1.000 | 0.000 | 0.000 |
| original_language | 0.000 | 0.071 | 0.000 | 0.000 | 0.111 | 0.070 | 0.000 | 0.144 | 0.000 | 1.000 | 0.000 |
| status | 0.000 | 0.056 | 0.000 | 0.000 | 0.000 | 0.019 | 0.000 | 0.028 | 0.000 | 0.000 | 1.000 |
| | belongs_to_collection | budget | genres | id | original_language | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | cast | crew | release_year | return |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Toy Story Collection | 30000000 | Animation, Comedy, Family | 862 | en | Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences. | 21.95 | Pixar Animation Studios | United States of America | 1995-10-30 | 373554033.0 | 81.0 | English | Released | NaN | Toy Story | 7.7 | 5415 | Tom Hanks, Tim Allen, Don Rickles, Jim Varney, Wallace Shawn, John Ratzenberger, Annie Potts, John Morris, Erik von Detten, Laurie Metcalf, R. Lee Ermey, Sarah Freeman, Penn Jillette | Director: John Lasseter, Screenplay: Alec Sokolow, Producer: Ralph Guggenheim, Executive Producer: Steve Jobs, Editor: Robert Gordon, Art Direction: Ralph Eggleston, Foley Editor: Mary Helen Leasman, Animation: Ken Willard, ADR Editor: Marilyn McCoppen, Orchestrator: Don Davis, Color Timer: Dale E. Grahn, CG Painter: William Cone, Original Story: Andrew Stanton, Post Production Supervisor: Patsy Bouge, Sculptor: Shelley Daniels Lekven, Animation Director: Rich Quade, Music: Randy Newman, Layout: Desirée Mourad, Music Editor: James Flamberg, Negative Cutter: Rick Mackay, Title Designer: Susan Bradley, Supervising Technical Director: William Reeves, Songs: Randy Newman, Supervising Animator: Pete Docter, Sound Designer: Gary Rydstrom, Production Supervisor: Karen Robert Jackson, Executive Music Producer: Chris Montan, Visual Effects Supervisor: Thomas Porter, Visual Effects: Brian M. Rosen, Lighting Supervisor: Galyn Susman, Character Designer: Jean Gillmore, Set Dresser: Ann M. Rockwell, Editorial Manager: Julie M. McDonald, Assistant Editor: Dana Mulligan, Editorial Coordinator: Deirdre Morrison, Production Coordinator: Ellen Devine, Unit Publicist: Lauren Beth Strogoff, Sound Re-Recording Mixer: Gary Summers, Supervising Sound Editor: Tim Holland, Sound Effects Editor: Pat Jackson, Sound Design Assistant: Tom Myers, Assistant Sound Editor: Dan Engstrom, Casting Consultant: Ruth Lambert, ADR Voice Casting: Mickie McGowan | 1995 | 12.45 |
| 1 | NaN | 65000000 | Adventure, Fantasy, Family | 8844 | en | When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures. | 17.02 | TriStar Pictures, Teitler Film, Interscope Communications | United States of America | 1995-12-15 | 262797249.0 | 104.0 | English, Français | Released | Roll the dice and unleash the excitement! | Jumanji | 6.9 | 2413 | Robin Williams, Jonathan Hyde, Kirsten Dunst, Bradley Pierce, Bonnie Hunt, Bebe Neuwirth, David Alan Grier, Patricia Clarkson, Adam Hann-Byrd, Laura Bell Bundy, James Handy, Gillian Barber, Brandon Obray, Cyrus Thiedeke, Gary Joseph Thorup, Leonard Zola, Lloyd Berry, Malcolm Stewart, Annabel Kershaw, Darryl Henriques, Robyn Driscoll, Peter Bryant, Sarah Gilson, Florica Vlad, June Lion, Brenda Lockmuller | Executive Producer: Robert W. Cort, Screenplay: Jim Strain, Original Music Composer: James Horner, Director: Joe Johnston, Editor: Robert Dalva, Casting: Nancy Foy, Animation Supervisor: Kyle Balda, Production Design: James D. Bissell, Producer: William Teitler, Director of Photography: Thomas E. Ackerman, Novel: Chris van Allsburg | 1995 | 4.04 |
| 2 | Grumpy Old Men Collection | 0 | Romance, Comedy | 15602 | en | A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max. | 11.71 | Warner Bros., Lancaster Gate | United States of America | 1995-12-22 | 0.0 | 101.0 | English | Released | Still Yelling. Still Fighting. Still Ready for Love. | Grumpier Old Men | 6.5 | 92 | Walter Matthau, Jack Lemmon, Ann-Margret, Sophia Loren, Daryl Hannah, Burgess Meredith, Kevin Pollak | Director: Howard Deutch, Characters: Mark Steven Johnson, Writer: Mark Steven Johnson, Sound Recordist: Jack Keller | 1995 | 0.00 |
| 3 | NaN | 16000000 | Comedy, Drama, Romance | 31357 | en | Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe. | 3.86 | Twentieth Century Fox Film Corporation | United States of America | 1995-12-22 | 81452156.0 | 127.0 | English | Released | Friends are the people who let you be yourself... and never let you forget it. | Waiting to Exhale | 6.1 | 34 | Whitney Houston, Angela Bassett, Loretta Devine, Lela Rochon, Gregory Hines, Dennis Haysbert, Michael Beach, Mykelti Williamson, Lamont Johnson, Wesley Snipes | Director: Forest Whitaker, Screenplay: Terry McMillan, Producer: Caron K, Executive Producer: Terry McMillan, Novel: Terry McMillan, Original Music Composer: Kenneth Edmonds | 1995 | 5.09 |
| 4 | Father of the Bride Collection | 0 | Comedy | 11862 | en | Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own. | 8.39 | Sandollar Productions, Touchstone Pictures | United States of America | 1995-02-10 | 76578911.0 | 106.0 | English | Released | Just When His World Is Back To Normal... He's In For The Surprise Of His Life! | Father of the Bride Part II | 5.7 | 173 | Steve Martin, Diane Keaton, Martin Short, Kimberly Williams-Paisley, George Newbern, Kieran Culkin, BD Wong, Peter Michael Goetz, Kate McGregor-Stewart, Jane Adams, Eugene Levy, Lori Alan | Original Music Composer: Alan Silvestri, Director of Photography: Elliot Davis, Screenplay: Albert Hackett, Producer: Nancy Meyers, Director: Charles Shyer, Editor: Adam Bernardi | 1995 | 0.00 |
| 5 | NaN | 60000000 | Action, Crime, Drama, Thriller | 949 | en | Obsessive master thief, Neil McCauley leads a top-notch crew on various insane heists throughout Los Angeles while a mentally unstable detective, Vincent Hanna pursues him without rest. Each man recognizes and respects the ability and the dedication of the other even though they are aware their cat-and-mouse game may end in violence. | 17.92 | Regency Enterprises, Forward Pass, Warner Bros. | United States of America | 1995-12-15 | 187436818.0 | 170.0 | English, Español | Released | A Los Angeles Crime Saga | Heat | 7.7 | 1886 | Al Pacino, Robert De Niro, Val Kilmer, Jon Voight, Tom Sizemore, Diane Venora, Amy Brenneman, Ashley Judd, Mykelti Williamson, Natalie Portman, Ted Levine, Tom Noonan, Tone Loc, Hank Azaria, Wes Studi, Dennis Haysbert, Danny Trejo, Henry Rollins, William Fichtner, Kevin Gage, Susan Traylor, Jerry Trimble, Ricky Harris, Jeremy Piven, Xander Berkeley, Begonya Plaza, Rick Avery, Hazelle Goodman, Ray Buktenica, Max Daniels, Vince Deadrick Jr., Steven Ford, Farrah Forke, Patricia Healy, Paul Herman, Cindy Katz, Brian Libby, Dan Martin, Mario Roberts, Thomas Rosales, Jr., Yvonne Zima, Mick Gould, Bud Cort, Viviane Vives, Kim Staunton, Martin Ferrero, Brad Baldridge, Andrew Camuccio, Kenny Endoso, Kimberly Flynn, Niki Harris, Bill McIntosh, Rick Marzan, Terry Miller, Daniel O'Haco, Kai Soremekun, Peter Blackwell, Trevor Coppola, Mary Kircher, Darin Mangan, Robert Miranda, Manny Perry, Iva Franks Singer, Tim Werner, Philip Ettington | Director: Michael Mann, Screenplay: Michael Mann, Producer: Michael Mann, Original Music Composer: Elliot Goldenthal, Director of Photography: Dante Spinotti, Editor: Tom Rolf, Casting: Jane Brody, Production Design: Neil Spisak, Art Direction: Margie Stone McShirley, Costume Design: Deborah Lynn Scott, Music Editor: Michael Connell, Supervising Sound Editor: Larry Kemp, Special Effects Coordinator: Terry D. Frazee, Special Effects: Donald Frazee, Visual Effects Supervisor: Neil Krepela, Stunt Coordinator: Joel Kramer, Stunts: Doug Coleman, Set Decoration: Anne H. Ahrens, Costume Supervisor: Darryl M. Athons, Script Supervisor: Cate Hardman, Art Department Coordinator: Oscar Mazzola, Assistant Art Director: Dianne Wager, Construction Coordinator: Anthony Lattanzio, Assistant Costume Designer: David Le Vey, Hairstylist: Ilona Herman, Key Hair Stylist: Vera Mitchell, Makeup Artist: Ken Diaz, Dialogue Editor: Lauren Stephens, Camera Operator: Gary Jay, Steadicam Operator: James Muro, Still Photographer: Frank Connor, First Assistant Camera: Chris Moseley, Rigging Gaffer: Frank Dorowsky, Music Supervisor: Budd Carr, First Assistant Editor: Ray Boniker, Sound Re-Recording Mixer: Mark Smith, Technical Supervisor: Mick Gould, Executive Producer: Arnon Milchan, Associate Producer: Gusmano Cesaretti, Unit Production Manager: Christopher Cronyn, Assistant Director: Michael Waxman, Casting Associate: Alison E. McBryde, Set Costumer: Marsha Bozeman, Digital Effects Supervisor: Jeff Wells, Sound Recordist: Philip Rogers, Additional Soundtrack: Jimmy Webb | 1995 | 3.12 |
| 6 | NaN | 58000000 | Comedy, Romance | 11860 | en | An ugly duckling having undergone a remarkable change, still harbors feelings for her crush: a carefree playboy, but not before his business-focused brother has something to say about it. | 6.68 | Paramount Pictures, Scott Rudin Productions, Mirage Enterprises, Sandollar Productions, Constellation Entertainment, Worldwide, Mont Blanc Entertainment GmbH | Germany, United States of America | 1995-12-15 | 0.0 | 127.0 | Français, English | Released | You are cordially invited to the most surprising merger of the year. | Sabrina | 6.2 | 141 | Harrison Ford, Julia Ormond, Greg Kinnear, Angie Dickinson, Nancy Marchand, John Wood, Richard Crenna, Lauren Holly, Dana Ivey, Fanny Ardant, Patrick Bruel, Paul Giamatti, Miriam Colón, Elizabeth Franz, Valérie Lemercier, Becky Ann Baker, John C. Vennema, Margo Martindale, J. Smith-Cameron, Christine Luneau-Lipton, Michael Dees, Denis Holmes, Jo-Jo Lowe, Ira Wheeler, Philippa Cooper, Ayako Kawahara, François Genty, Guillaume Gallienne, Inés Sastre, Phina Oruche, Andrea Behalikova, Jennifer Herrera, Kristina Kumlin, Eva Linderholm, Carmen Chaplin, Micheline Van de Velde, Joanna Rhodes, Alan Boone, Patrick Forster-Delmas, Kentaro Matsuo, Peter McKernan, Ed Connelly, Ronald L. Schwary, Alvin Lum, Siching Song, Phil Nee, Randy Becker, Susan Browning, Anthony Mondal, Peter Parks, Woodrow Asai, Eric Bruno Borgman, Michael Cline, Christopher Del Gaudio, Philippe Hartmann, Jerry Quinn, Dori Rosenthal | Director: Sydney Pollack, Screenplay: David Rayfiel, Producer: Scott Rudin, Original Music Composer: John Williams, Editor: Fredric Steinkamp, Casting: David Rubin, Production Design: Brian Morris, Makeup Artist: Joseph A. Campayno, Hairstylist: Stephen G. Bishop, Co-Costume Designer: Gary Jones, Costume Design: Ann Roth, Set Decoration: Amy Marshall, Art Department Coordinator: Miriam Schapiro, Sound mixer: Danny Michael, Sound Re-Recording Mixer: Scott Millan, Supervising Sound Effects Editor: Myron Nettinga, Sound Effects Editor: Joe Earle, Supervising Sound Editor: J. Paul Huntsman, Boom Operator: Andrew Schmetterling, Dialogue Editor: Benjamin Beardwood, Script Supervisor: Mary A. Kelly, Still Photographer: Brian Hamill, Camera Operator: Giovanni Fiore Coltellacci, Director of Photography: Giuseppe Rotunno, Casting Associate: Ronna Kress, Assistant Costume Designer: Michelle Matland, Costume Supervisor: Donna Maloney, First Assistant Editor: Karl F. Steinkamp, Executive Producer: Ronald L. Schwary, Art Direction: John Kasarda, Production Manager: Ronald L. Schwary, Production Supervisor: Thomas A. Imperato, Casting Assistant: Bill Kaufman, Location Manager: Joseph E. Iberti, Production Coordinator: Katherine Kennedy | 1995 | 0.00 |
| 7 | NaN | 0 | Action, Adventure, Drama, Family | 45325 | en | A mischievous young boy, Tom Sawyer, witnesses a murder by the deadly Injun Joe. Tom becomes friends with Huckleberry Finn, a boy with no future and no family. Tom has to choose between honoring a friendship or honoring an oath because the town alcoholic is accused of the murder. Tom and Huck go through several adventures trying to retrieve evidence. | 2.56 | Walt Disney Pictures | United States of America | 1995-12-22 | 0.0 | 97.0 | English, Deutsch | Released | The Original Bad Boys. | Tom and Huck | 5.4 | 45 | Jonathan Taylor Thomas, Brad Renfro, Rachael Leigh Cook, Michael McShane, Amy Wright, Eric Schweig, Tamara Mello | Screenplay: Stephen Sommers, Director: Peter Hewitt, Novel: Mark Twain | 1995 | 0.00 |
| 8 | NaN | 35000000 | Action, Adventure, Thriller | 9091 | en | International action superstar Jean Claude Van Damme teams with Powers Boothe in a Tension-packed, suspense thriller, set against the back-drop of a Stanley Cup game.Van Damme portrays a father whose daughter is suddenly taken during a championship hockey game. With the captors demanding a billion dollars by game's end, Van Damme frantically sets a plan in motion to rescue his daughter and abort an impending explosion before the final buzzer... | 5.23 | Universal Pictures, Imperial Entertainment, Signature Entertainment | United States of America | 1995-12-22 | 64350171.0 | 106.0 | English | Released | Terror goes into overtime. | Sudden Death | 5.5 | 174 | Jean-Claude Van Damme, Powers Boothe, Dorian Harewood, Raymond J. Barry, Ross Malinger, Whittni Wright | Director: Peter Hyams, Screenplay: Gene Quintano, Producer: Howard Baldwin, Music: John Debney, Director of Photography: Peter Hyams, Editor: Steven Kemper | 1995 | 1.84 |
| 9 | James Bond Collection | 58000000 | Adventure, Action, Thriller | 710 | en | James Bond must unmask the mysterious head of the Janus Syndicate and prevent the leader from utilizing the GoldenEye weapons system to inflict devastating revenge on Britain. | 14.69 | United Artists, Eon Productions | United Kingdom, United States of America | 1995-11-16 | 352194034.0 | 130.0 | English, Pусский, Español | Released | No limits. No fears. No substitutes. | GoldenEye | 6.6 | 1194 | Pierce Brosnan, Sean Bean, Izabella Scorupco, Famke Janssen, Joe Don Baker, Judi Dench, Gottfried John, Robbie Coltrane, Alan Cumming, Tchéky Karyo, Desmond Llewelyn, Samantha Bond, Michael Kitchen, Serena Gordon, Simon Kunz, Billy J. Mitchell, Constantine Gregory, Minnie Driver, Michelle Arthur, Ravil Isyanov | Director: Martin Campbell, Characters: Ian Fleming, Screenplay: Bruce Feirstein, Producer: Anthony Waye, Executive Producer: Tom Pevsner, Original Music Composer: Eric Serra, Songs: Tina Turner, Director of Photography: Phil Meheux, Editor: Terry Rawlings, Casting: Pam Dixon, Production Design: Peter Lamont, Art Direction: Charles Dwight Lee, Set Decoration: Michael Ford, Costume Design: Lindy Hemming, Story: Michael France, Assistant Art Director: Steven Lawrence, Construction Coordinator: Tony Graysmark, Supervising Art Director: Neil Lamont, Music Editor: Robert Hathaway, Armorer: Charles Bodycomb, Script Supervisor: June Randall, Camera Operator: Tim Wooster, Still Photographer: George Whitear, Gaffer: Steve Foster, Special Effects Supervisor: Chris Corbould, Visual Effects Coordinator: Mara Bryan, Visual Effects Editor: Tim Grover, Dialogue Editor: Peter Musgrave, Sound Re-Recording Mixer: John Hayward, Supervising Sound Editor: Jim Shields, Sound Recordist: David John | 1995 | 6.07 |
| | belongs_to_collection | budget | genres | id | original_language | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | cast | crew | release_year | return |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 45356 | NaN | 0 | NaN | 67179 | it | Sentenced to life imprisonment for illegal activities, Italian International member Giulio Manieri holds on to his political ideals while struggling against madness in the loneliness of his prison cell. | 0.23 | NaN | NaN | 1972-01-01 | 0.0 | 90.0 | Italiano | Released | NaN | St. Michael Had a Rooster | 6.0 | 3 | Giulio Brogi, Renato Cestiè, Vito Cipolla, Daniele Dublino | Novel: Leo Tolstoy, Screenplay: Paolo Taviani, Director: Vittorio Taviani | 1972 | 0.0 |
| 45357 | NaN | 0 | Horror, Mystery, Thriller | 84419 | en | An unsuccessful sculptor saves a madman named "The Creeper" from drowning. Seeing an opportunity for revenge, he tricks the psycho into murdering his critics. | 0.22 | Universal Pictures | United States of America | 1946-03-29 | 0.0 | 65.0 | English | Released | Meet...The CREEPER! | House of Horrors | 6.3 | 8 | Rondo Hatton, Robert Lowery, Virginia Grey, Bill Goodwin, Martin Kosleck, Alan Napier, Howard Freeman, Virginia Christine, Joan Shawlee, Byron Foulger, Syd Saylor | Set Decoration: Ralph Warrington, Art Direction: Abraham Grossman, Makeup Artist: Jack P. Pierce, Editor: Philip Cahn, Director: Jean Yarbrough, Screenplay: George Bricker, Director of Photography: Maury Gertsman, Original Story: Dwight V. Babcock, Producer: Ben Pivar | 1946 | 0.0 |
| 45358 | NaN | 0 | Mystery, Horror | 390959 | en | In this true-crime documentary, we delve into the murder spree that was the inspiration for Joe Berlinger's "Book of Shadows: Blair Witch 2". | 0.08 | NaN | NaN | 2000-10-22 | 0.0 | 45.0 | English | Released | NaN | Shadow of the Blair Witch | 7.0 | 2 | Tony Abatemarco, Andre Brooks, Mariclare Costello, Bill Dreggors, Apollo Dukakis, Philip Friedman, James Gleason, Dilva Henry, Bari Hochwald, Wendy Hoffman, John Huck, Rachel Moskowitz, Sandy Mulvihill, Roger Nolan, Chris Parnell, Byrne Piven, Richard Sexton, Rich Williams, Ray Xifo | Director: Ben Rock, Writer: Ben Rock, Producer: Ben Rock, Executive Producer: Pirie Jones, Line Producer: Kimberly Rach, Original Music Composer: Sasha Bogdanowitsch, Cinematography: Neal Fredericks, Editor: George Rizkallah, Casting: David Giella, Production Design: Steven P. Duchscherer, Art Direction: Chris Davis, Makeup Department Head: Kimberly Eckhout, Makeup Artist: Hillary Wallace, Hairstylist: Hillary Wallace, Hair Department Head: Kimberly Eckhout, Assistant Director: Aaron Walters, Art Department Coordinator: Shaun Richkind, Sound Designer: Jeremy M. Gilleece, Sound Mixer: Jeremy M. Gilleece, Boom Operator: Jackson Hilliard, Still Photographer: James Grossman, Gaffer: Dale Obert, Costume Design: Ann Roth | 2000 | 0.0 |
| 45359 | NaN | 0 | Horror | 289923 | en | A film archivist revisits the story of Rustin Parr, a hermit thought to have murdered seven children while under the possession of the Blair Witch. | 0.39 | Neptune Salad Entertainment, Pirie Productions | United States of America | 2000-10-03 | 0.0 | 30.0 | English | Released | Do you know what happened 50 years before "The Blair Witch Project"? | The Burkittsville 7 | 7.0 | 1 | Monty Bane, Lucy Butler, David Grammer, Bill Dreggors, Frank Pastor, Heather Donahue, Joshua Leonard, Michael C. Williams | Director: Ben Rock, Writer: Ben Rock | 2000 | 0.0 |
| 45360 | NaN | 0 | Science Fiction | 222848 | en | It's the year 3000 AD. The world's most dangerous women are banished to a remote asteroid 45 million light years from earth. Kira Murphy doesn't belong; wrongfully accused of a crime she did not commit, she's thrown in this interplanetary prison and left to her own defenses. But Kira's a fighter, and soon she finds herself in the middle of a female gang war; where everyone wants a piece of the action... and a piece of her! "Caged Heat 3000" takes the Women-in-Prison genre to a whole new level... and a whole new galaxy! | 0.66 | Concorde-New Horizons | United States of America | 1995-01-01 | 0.0 | 85.0 | English | Released | NaN | Caged Heat 3000 | 3.5 | 1 | Lisa Boyle, Kena Land, Zaneta Polard, Don Yanan, Debra K. Beatty, Mark Sikes, Robert J. Ferrelli, Ellyn Dawn Humphreys, Ron Jeremy, Ben Ramsey | Executive Producer: Mike Elliott, Director: Aaron Osborne, Producer: Mike Upton, Writer: Emile Dupont, Editor: Felix Chamberlain | 1995 | 0.0 |
| 45361 | NaN | 0 | Drama, Action, Romance | 30840 | en | Yet another version of the classic epic, with enough variation to make it interesting. The story is the same, but some of the characters are quite different from the usual, in particular Uma Thurman's very special maid Marian. The photography is also great, giving the story a somewhat darker tone. | 5.68 | Westdeutscher Rundfunk (WDR), Working Title Films, 20th Century Fox Television, CanWest Global Communications | Canada, Germany, United Kingdom, United States of America | 1991-05-13 | 0.0 | 104.0 | English | Released | NaN | Robin Hood | 5.7 | 26 | Patrick Bergin, Uma Thurman, David Morrissey, Jürgen Prochnow, Jeroen Krabbé | Director: John Irvin, Writer: John McGrath, Story: Sam Resnick, Producer: Sarah Radclyffe, Music: Geoffrey Burgon, Director of Photography: Jason Lehel, Editor: Peter Tanner, Casting: Susie Figgis | 1991 | 0.0 |
| 45362 | NaN | 0 | Drama | 111109 | tl | An artist struggles to finish his work while a storyline about a cult plays in his head. | 0.18 | Sine Olivia | Philippines | 2011-11-17 | 0.0 | 360.0 | NaN | Released | NaN | Century of Birthing | 9.0 | 3 | Angel Aquino, Perry Dizon, Hazel Orencio, Joel Torre, Bart Guingona, Soliman Cruz , Roeder, Angeli Bayani, Dante Perez, Betty Uy-Regala, Modesta | Director: Lav Diaz, Writer: Lav Diaz, Production Design: Dante Perez, Music: Lav Diaz, Editor: Lav Diaz, Cinematography: Lav Diaz | 2011 | 0.0 |
| 45363 | NaN | 0 | Action, Drama, Thriller | 67758 | en | When one of her hits goes wrong, a professional assassin ends up with a suitcase full of a million dollars belonging to a mob boss ... | 0.90 | American World Pictures | United States of America | 2003-08-01 | 0.0 | 90.0 | English | Released | A deadly game of wits. | Betrayal | 3.8 | 6 | Erika Eleniak, Adam Baldwin, Julie du Page, James Remar, Damian Chapa, Louis Mandylor, Tom Wright, Jeremy Lelliott, James Quattrochi, Jason Widener, Joe Sabatino, Kiko Ellsworth, Don Swayze, Peter Dobson, Darrell Dubovsky | Director: Mark L. Lester, Screenplay: Jeffrey Goldenberg, Original Music Composer: Richard McHugh, Director of Photography: João Fernandes | 2003 | 0.0 |
| 45364 | NaN | 0 | NaN | 227506 | en | In a small town live two brothers, one a minister and the other one a hunchback painter of the chapel who lives with his wife. One dreadful and stormy night, a stranger knocks at the door asking for shelter. The stranger talks about all the good things of the earthly life the minister is missing because of his puritanical faith. The minister comes to accept the stranger's viewpoint but it is others who will pay the consequences because the minister will discover the human pleasures thanks to, ehem, his sister- in -law… The tormented minister and his cuckolded brother will die in a strange accident in the chapel and later an infant will be born from the minister's adulterous relationship. | 0.00 | Yermoliev | Russia | 1917-10-21 | 0.0 | 87.0 | NaN | Released | NaN | Satan Triumphant | 0.0 | 0 | Iwan Mosschuchin, Nathalie Lissenko, Pavel Pavlov, Aleksandr Chabrov, Vera Orlova | Director: Yakov Protazanov, Producer: Joseph N. Ermolieff | 1917 | 0.0 |
| 45365 | NaN | 0 | NaN | 461257 | en | 50 years after decriminalisation of homosexuality in the UK, director Daisy Asquith mines the jewels of the BFI archive to take us into the relationships, desires, fears and expressions of gay men and women in the 20th century. | 0.16 | NaN | United Kingdom | 2017-06-09 | 0.0 | 75.0 | English | Released | NaN | Queerama | 0.0 | 0 | NaN | Director: Daisy Asquith | 2017 | 0.0 |